Little words - Big meanings (in MT syntactic transfer)
Mihaela Colhon
University of Craiova, Department of Computer Science
April 25, 2012
Table of contents
English-Romanian Phrase Alignment
  Romanian treebank
Function words = syntactic glue for sentences
  English Functional Words Sequences
  Romanian Syntactic Sequences
English-Romanian Parallel Sequences with Syntactic Constituents
  ADVP, CC, CD, DT
  DT (cont.), IN
  IN (cont.), JJ, MD
  MD (cont.), NN, NNS, NP, PDT, PP, PRP, RB, S, SBAR
  TO, VBG, VP, WDT, WRB
English Syntactic Sequences with FW
  [DT NN NN]
  [IN/as, NP]
  [IN/at, NP]
  [IN/by, NP]
  [IN/for, NP]
  [IN/of, NP]
Romanian treebank
Accuracy: 87% (for the Romanian part of the English-Romanian parallel treebank), compared with the Romanian chunker annotations.
Token word (MSD tag) | Treebank tags | Chunker annotations | Number of matches
vot (Ncms-n) | VP VP NP VP VP S | Np Pp | no match
de-asemenea (Rgp) | ADVP VP S ROOT | Ap | one match
economic (Afpms-n) | ADJP NP NP VP ... | Ap Np Pp | two matches
dividende (Ncfp-n) | NP PP VP S ROOT | Np Pp | two matches
în (Spsa) | PP VP PP S | Ap Vp Pp | three matches
Table: Example of parallel sequences of treebank tags and chunker annotations, together with their matching degrees.
Function words = syntactic glue for sentences
In any syntactic structure we can identify two major categories of words:
- Content words, which identify objects, entities, properties, relationships or events, and are syntactically represented by nouns, adjectives, verbs and adverbs.
- Functional words, which help put words together into a structurally correct sentence and indicate how words are related to each other. Functional words can be determiners, quantifiers, prepositions or connectives.
From the English-Romanian Parallel Treebank with Syntactic Constituents, 2120 English Functional Words Constructions were extracted together with their Romanian translations.
English Functional Words = words that in the Penn POS Tagset formalism have one of the following tags: CC, DT, IN, MD, PRP, PP$, RP, TO, WDT, WP, WP$, WRB.
English syntactic constructions with functional words: [ { Phrasal-Tag }* Pos-Tag/FW { Phrasal-Tag }* ], where FW denotes a functional word.
Examples:
[NP, PRN, CC/and, NP]
[RB, JJ, CC/and, JJ]
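The encoding above can be sketched in a few lines of Python. This is only an illustration, not the extraction procedure used for the treebank: the function name `fw_construction` and the `(tag, word)` input representation are assumptions, and a child is treated as a functional word simply when its POS tag is in the list given above.

```python
# Penn POS tags treated as functional words (the list given on the slide).
FUNCTIONAL_TAGS = {"CC", "DT", "IN", "MD", "PRP", "PP$", "RP", "TO",
                   "WDT", "WP", "WP$", "WRB"}


def fw_construction(children):
    """Encode the children of one constituent as a construction string.

    `children` is a list of (tag, word) pairs (a hypothetical input format):
    `word` is None for a phrasal child such as NP, and a string for a
    word-level child. Returns None when no child is a functional word.
    """
    parts = []
    has_fw = False
    for tag, word in children:
        if word is not None and tag in FUNCTIONAL_TAGS:
            parts.append(f"{tag}/{word.lower()}")  # keep the functional word itself
            has_fw = True
        else:
            parts.append(tag)                      # keep only the tag
    return "[" + ", ".join(parts) + "]" if has_fw else None


coordination = [("NP", None), ("PRN", None), ("CC", "and"), ("NP", None)]
print(fw_construction(coordination))  # [NP, PRN, CC/and, NP]
```

Only functional words keep their surface form; everything else is reduced to its tag, which is what lets many different sentences collapse onto the same construction.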
The corresponding Romanian translations of the English Functional Words Constructions are encoded in the same format.
Romanian Functional Words = words that in the MULTEXT-East Tagset formalism have one of the following tags: Pd-, Pi-, Ps-, Px-, Pz-, D-, T-, S-, C-, Q-.
Romanian syntactic constructions: [ { Phrasal-Tag }* MULTEXT-East-Tag/FW { Phrasal-Tag }* ], where FW denotes a functional word.
Examples:
[Di3-po---e/altor, NP]
[VP, Crssp/și, Tsfs/a, NP, PUNCT]
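As a rough sketch, the Romanian functional-word test can be written as a prefix check on the MULTEXT-East tag. The helper name `is_romanian_fw` is hypothetical, and reading the trailing dash of each listed tag as "any continuation" is an assumption about the notation.

```python
# Tag prefixes listed above as Romanian functional words.
RO_FUNCTIONAL_PREFIXES = ("Pd", "Pi", "Ps", "Px", "Pz", "D", "T", "S", "C", "Q")


def is_romanian_fw(msd):
    """Return True if the MULTEXT-East tag starts with a functional prefix."""
    return msd.startswith(RO_FUNCTIONAL_PREFIXES)


# Tags taken from the examples in this deck.
for msd in ["Crssp", "Tsfs", "Spsa", "Di3-po---e", "Ncms-n", "Rgp"]:
    print(msd, is_romanian_fw(msd))
```

Note that only some pronoun subtypes (Pd, Pi, Ps, Px, Pz) are listed, so under this reading a tag with any other pronoun prefix is not counted as functional.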
English-Romanian Parallel Sequences with Syntactic Constituents
Figure: The resulting parallel sequences were saved into a database with four fields: SynPhrase En, SynPhrase RO, Treebank EN, Treebank RO.
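A four-field store of this kind could look roughly like the following, using Python's built-in sqlite3 module. The database engine, the table name `parallel_sequences` and the underscore column names are all assumptions. The two treebank strings come from the [IN/by, NP] example in this deck, while the SynPhrase_RO value is hand-built from the Romanian encoding format purely for illustration.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE parallel_sequences (
           SynPhrase_EN TEXT,
           SynPhrase_RO TEXT,
           Treebank_EN  TEXT,
           Treebank_RO  TEXT
       )"""
)

record = (
    "[IN/by, NP]",
    "[Spca/de-către, NP]",  # illustrative encoding, not taken from the database
    "[PP [IN Sp/by] [NP [DT Dd/the] [NN Np/director-general]]]",
    "[NP [Spca 7/de-către] [NP [Ncmsry 8/directorul] [Afpms-n 9/general]]]",
)
conn.execute("INSERT INTO parallel_sequences VALUES (?, ?, ?, ?)", record)
conn.commit()


def translations_of(syn_phrase_en):
    """Return (SynPhrase_RO, Treebank_RO) rows for one English construction."""
    cur = conn.execute(
        "SELECT SynPhrase_RO, Treebank_RO FROM parallel_sequences "
        "WHERE SynPhrase_EN = ?",
        (syn_phrase_en,),
    )
    return cur.fetchall()


print(translations_of("[IN/by, NP]"))
```

Keying the lookup on the English construction string makes it easy to list every Romanian realisation recorded for one pattern.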
English Syntactic Sequences with FW
[DT NN NN]
DT + NN → N----y (y: definiteness)
DT + NN → T---- + N----n (T: article)
DT + NN → D---------- + N----n (D: determiner)
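One way to read the three patterns above is as position-wise masks over MULTEXT-East tags. The sketch below assumes that "-" in a mask means "any value at this position"; the names `matches_mask`, `PATTERNS` and `classify` are hypothetical. Apart from `Ncfsry`, `Ncfsrn`, `Ncfp-n`, `Ncms-n` and `Di3-po---e`, which appear elsewhere in this deck, the tag `Timsr` is an illustrative input only.

```python
def matches_mask(msd, mask):
    """True if `msd` agrees with `mask` wherever the mask is not '-'."""
    for i, m in enumerate(mask):
        if m == "-":
            continue  # unconstrained position
        if i >= len(msd) or msd[i] != m:
            return False
    return True


# The three Romanian realisations of English DT + NN listed above.
PATTERNS = [
    ("definite noun", ["N----y"]),
    ("article + noun", ["T----", "N----n"]),
    ("determiner + noun", ["D----------", "N----n"]),
]


def classify(ro_tags):
    """Name the first pattern whose masks match the Romanian tag sequence."""
    for name, masks in PATTERNS:
        if len(ro_tags) == len(masks) and all(
            matches_mask(tag, mask) for tag, mask in zip(ro_tags, masks)
        ):
            return name
    return None


print(classify(["Ncfsry"]))                # definite noun
print(classify(["Timsr", "Ncms-n"]))       # article + noun
print(classify(["Di3-po---e", "Ncfp-n"]))  # determiner + noun
```

The first case reflects the fact that Romanian marks definiteness on the noun itself, so an English determiner can disappear as a separate token in the translation.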
[IN/as, NP]
Treebank EN: [PP [IN Rsp/as] [NP [NP Afp/strict] [ADJP [RB Cs/as] [JJ Afp/possible]]]]
Treebank RO: [PP [Rw 14/cât] [Rp 15/mai] [ADJP [Afpfp-n 16/stricte] [ADJP [Rgp 17/posibil]]]]
[IN/at, NP]
Treebank EN: [PP [IN Sp/at] [NP [RBS Pi3-p/most]]]
Treebank RO: [PP [Rgp 3/maximum]]

Treebank EN: [PP [IN Sp/at] [NP [NP [DT Dd/the] [NN Ncns/end]] [PP [IN Sp/of] [NP [DT Dd/the] [JJ Afp/financial] [NN Ncns/year]]]]]
Treebank RO: [NP [Spsa 1/la] [NP [NP [Ncfsry 2/încheierea]] [NP [Ncmsoy 3/exercițiului] [ADJP [Afpms-n 4/financiar]]]]]
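Bracketed treebank strings like the ones above can be read into a nested structure with a small recursive parser. This is a minimal sketch: the `(label, children)` tuple representation and the function names are assumptions, and it expects complete, balanced input, so strings truncated with "..." would need extra handling.

```python
import re

# A token is an opening bracket, a closing bracket, or a run of other characters.
TOKEN = re.compile(r"\[|\]|[^\s\[\]]+")


def parse(text):
    """Parse '[LABEL child ...]' into (label, children); leaves stay strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "[", "expected an opening bracket"
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf such as 'Sp/at' or '3/maximum'
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()


def words(tree):
    """Collect the surface words (the text after '/') from left to right."""
    _, children = tree
    out = []
    for child in children:
        if isinstance(child, tuple):
            out.extend(words(child))
        else:
            out.append(child.split("/", 1)[-1])
    return out


en = parse("[PP [IN Sp/at] [NP [RBS Pi3-p/most]]]")
ro = parse("[PP [Rgp 3/maximum]]")
print(words(en), words(ro))  # ['at', 'most'] ['maximum']
```

The example shows a two-word English phrase headed by a preposition aligning with a single Romanian adverb, which is the kind of asymmetry the parallel sequences record.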
[IN/by, NP]
Treebank EN: [PP [IN Sp/by] [NP [DT Dd/the] [NN Np/director-general]]]
Treebank RO: [NP [Spca 7/de-către] [NP [Ncmsry 8/directorul] [Afpms-n 9/general]]]

Treebank EN: [PP [IN Sp/by] [NP [NP [DT Dd/the] [NN Ncns/agency]] ...]
Treebank RO: [NP [Spca 14/de-către] [NP [NP [Ncfsrn 15/Agenție]] ...]
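To see which Romanian preposition each English preposition is paired with, the leading functional word on each side can be pulled out with regular expressions and tallied. This is a simplified sketch, not the method behind the slides: it looks only at the first IN leaf in the English string and the first S-tagged leaf in the Romanian string, and the names `EN_PREP`, `RO_PREP` and `leading_prepositions` are hypothetical.

```python
import re
from collections import Counter

# First English leaf under IN, e.g. '[IN Sp/by]' -> 'by'.
EN_PREP = re.compile(r"\[IN [^/\s]+/([^\]\s]+)\]")
# First Romanian leaf whose tag starts with S, e.g. '[Spca 7/de-către]'.
RO_PREP = re.compile(r"\[S[^\s\]]* \d+/([^\]\s]+)\]")


def leading_prepositions(en, ro):
    """Return (english, romanian) prepositions, or None if either is missing."""
    en_match, ro_match = EN_PREP.search(en), RO_PREP.search(ro)
    if en_match is None or ro_match is None:
        return None
    return (en_match.group(1), ro_match.group(1))


# Pairs transcribed from the examples in this deck ('...' marks elided text).
pairs = [
    ("[PP [IN Sp/by] [NP [DT Dd/the] [NN Np/director-general]]]",
     "[NP [Spca 7/de-către] [NP [Ncmsry 8/directorul] [Afpms-n 9/general]]]"),
    ("[PP [IN Sp/by] [NP [NP [DT Dd/the] [NN Ncns/agency]] ...]",
     "[NP [Spca 14/de-către] [NP [NP [Ncfsrn 15/Agenție]] ...]"),
    ("[PP [IN Sp/at] [NP [RBS Pi3-p/most]]]",
     "[PP [Rgp 3/maximum]]"),
]

counts = Counter()
for en, ro in pairs:
    pair = leading_prepositions(en, ro)
    if pair is not None:
        counts[pair] += 1

print(counts.most_common())
```

Pairs such as "at most" and "maximum", where the Romanian side has no preposition at all, are skipped here, but in a fuller treatment they are exactly the cases worth listing separately.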