Machine Translation
Marathi to UNL
Presented by
Ashwini, Salil
Center for Indian Language Technology Solutions
CSE, IIT Powai
Characteristics of Marathi
a. Syntactic structure
– Subject-object-verb, e.g. राम भात खातो. (Ram eats rice.)
– Similarity with Hindi
b. Morphology
– प्रत्यय (suffixes)
– Differences with Hindi
Main tasks
1. Marathi-UW dictionary building
2. Rulebase building for converting Marathi language phenomena to UNL expressions
3. Testing using corpus sentences
4. Verification with Hindi and Marathi deconverters.
Analysis consists of
• Morphology
• Syntax
• Semantics
• Pragmatics
Marathi analysis done so far
We focus on Marathi morphology
• Noun morphology
• Pronoun morphology
• Verb morphology
• Relation label morphology
• Adjective morphology
Types of adjectives in Marathi
1. Pronominal adjectives
1.1 Pronoun adjectives: the nine pronouns used as adjectives
1.2 Adjectives derived from the nine pronouns
2. Qualitative adjectives
2.1 Adjectives ending with the vowel आ
2.2 Adjectives ending with vowels other than आ
2.3 Postposition adjectives
Types of adjectives [contd.]
3. Numerical adjectives
• 3.1 Cardinal
3.1.1 (whole number)
3.1.2 (fractional number)
3.1.3 (entirety, totality, completeness)
• 3.2 Ordinal
• 3.3 Occurrencial
6 types
• 3.4 Distinctive
Does [pAvaNedonashe] mean 175 or 199.75?
- There is no word assigned to 199.75, 299.75, etc.
- The problems with paun, pauvane and savva.
- The word is read as (pAvaNedon) times 100 (she), i.e. 1.75 × 100 = 175. she and shambhar both mean 100. pAUNashe means 75. pAvaNeshambhar means 99.75.
- The powers of ten for which there is a distinct word in Marathi need to be stored separately.
- The pronunciation is not pAvaNedona-[pause]-she but pAvaNe-[pause]-donashe.
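The two candidate readings can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names `pavane` and `savva` and the constants are hypothetical, and it assumes the usual values of the modifiers (pAUNa = three quarters, pAvaNe = a quarter less than the number that follows, savvA = a quarter more).

```python
from fractions import Fraction

QUARTER = Fraction(1, 4)
PAUN = Fraction(3, 4)   # pAUNa: three quarters (assumed value)
DON = 2                 # don: two
SHE = 100               # she / shambhar: one hundred


def pavane(n):
    """pAvaNe + n: a quarter less than n (hypothetical helper)."""
    return Fraction(n) - QUARTER


def savva(n):
    """savvA + n: a quarter more than n (hypothetical helper)."""
    return Fraction(n) + QUARTER


# Reading 1: (pAvaNe don) x she, the modifier applies to 'don' only.
narrow = pavane(DON) * SHE
# Reading 2: pAvaNe (don x she), the modifier applies to the whole 'donashe'.
wide = pavane(DON * SHE)

print(float(narrow))         # 175.0
print(float(wide))           # 199.75
print(float(PAUN * SHE))     # 75.0  (pAUNashe)
print(float(pavane(SHE)))    # 99.75 (pAvaNeshambhar)
```

Reading 1 matches the meaning given on the slide, while reading 2 follows the pause in the pronunciation, which is why the bracketing has to be stored explicitly rather than inferred from the sound.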
Tables of numbers: continuous and random access
• Some forms of numbers are used for verbalizing the tables of numbers: ºÉÉiÉ / ºÉÉiÉÉ / ºÉÉiÉä / ºÉÉiÉÒä / ºÉiiÉä.
• Marathi: A, B times, (is C), occurring in the table for A. English: B A’s (are C).
• Usage of forms:
1. only for the expression ‘A’
2. only for ‘B times’
3. only while recalling the number directly without going through the table.
• Some forms occur especially for squares. The repetition is emphasized.
Words used to familiarise a child with numbers
• Some words are used mostly to familiarise a child with numbers: BEÒ BE, nÖEÔ nÉäxÉ, ÊiÉEÔ iÉÒxÉ, etc. The similarity of each word with the number is used to help a child remember the number. The words used as familiarisers are: BEÒ, nÖEÔ, ÊiÉEÔ, SÉÉèEÒ, {ÉÉSÉÒ, ºÉɽÒ, ºÉÉiÉÒä, +É`Ò, xÉ´Éä, nɽÒ.
Playing cards and the game of cricket
1. Playing cards:
ekka, durri / durra, tirri / tirra, chavvi / chouka, panji / panja, chhakki / chhakka, satti / satta, atthi / attha, navvi / nashsha, dashshi / dashsha.
2. Shots scoring multiple runs in the game of cricket:
चौकार, षटकार.
The current status of the dictionary
Number of entries: 375
• Dictionary
• Nouns
• Noun morphology suffixes
• Verbs
• Verb morphology suffixes
The current status of rulebase
Number of rules is 1050.
• Verb morphology (simple and conjunct verbs)
– Tense (past, present, future)
– Aspect of tense (progress, complete, custom)
– Voice (passive voice)
– Mood, अर्थ (imperative, should, negative)
– Ability, intention etc., for conjunct verbs only.
The current status of rulebase [contd.]
• Noun morphology
– Number
– With case marker (सामान्यरूप, the oblique form)
• Case when the penultimate vowel is either ऊ or ई, e.g. मूल - मुले (plural)
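The penultimate-vowel case can be pictured as a small string rewrite. The following is a minimal sketch, not the rulebase itself: it assumes the slide's example reads मूल → मुले (child → children), that the stem ends in a bare consonant, and that the plural suffix is े. The function names are hypothetical.

```python
# Long vowel signs and their short counterparts (Devanagari matras).
SHORTEN = {
    "\u0942": "\u0941",  # ू (long u) -> ु (short u)
    "\u0940": "\u093F",  # ी (long i) -> ि (short i)
}


def shorten_penultimate(stem):
    """Shorten a long u/i vowel sign that sits just before the final consonant.

    Assumes the stem ends in a single bare consonant, as in मूल.
    """
    if len(stem) >= 2 and stem[-2] in SHORTEN:
        return stem[:-2] + SHORTEN[stem[-2]] + stem[-1]
    return stem


def plural(stem, suffix="\u0947"):
    """Attach the plural suffix (assumed here to be े) to the shortened stem."""
    return shorten_penultimate(stem) + suffix


print(plural("\u092E\u0942\u0932"))  # मूल -> मुले
```

Real stems with conjunct consonants or other endings would need more cases than this two-entry table covers.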
The current status of rulebase [contd.]
• Relation labels used so far: agt, obj, gol, aoj, and, or
e.g. मुलांनी आंबे खाल्ले नव्हते. (The children had not eaten the mangoes.)
obj(eat(icl>do).@entry.@pred.@past.@not.@complete, mango(icl>fruit):08.@pl)
agt(eat(icl>do).@entry.@pred.@past.@not.@complete, child(icl>person):00.@pl)
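The shape of these expressions (relation label, two universal words, each with an optional node id and a list of attributes) can be sketched as plain string assembly. This is only an illustration of the notation shown above; the helpers `uw` and `relation` are hypothetical and are not part of the EnConverter or its rule format.

```python
def uw(headword, attrs=(), node_id=None):
    """Format a universal word: headword, optional ':id', then '.@attr' items."""
    text = headword
    if node_id is not None:
        text += ":" + node_id
    return text + "".join("." + attr for attr in attrs)


def relation(label, source, target):
    """Format a binary relation such as agt(...) or obj(...)."""
    return f"{label}({source}, {target})"


# The verb node shared by both relations in the example sentence.
eat = uw("eat(icl>do)", ["@entry", "@pred", "@past", "@not", "@complete"])
mango = uw("mango(icl>fruit)", ["@pl"], node_id="08")
child = uw("child(icl>person)", ["@pl"], node_id="00")

expressions = [
    relation("obj", eat, mango),
    relation("agt", eat, child),
]
for line in expressions:
    print(line)
```

Building the verb node once and reusing it in both relations mirrors the way the same `eat(icl>do)` node appears as the first argument of `obj` and `agt`.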
Plans
• Adjective morphology
• Pronoun morphology
• Relation labels handling for corpus sentences.
For simple sentences only.
THANK YOU
References:
• Damle, Moro Keshav (1970). Shastriya marathi vyakarana [SaswrIya marATI vyAkaraNa]. (Ed: K. S. Arjunwadkar). Pune: Deshmukh & Co.
• Meying, Zhu (2000). EnConverter specifications, version 2.1. Tokyo: UNU/IAS/UNL Center.
• Meying, Zhu (2002). UNL specifications, version 3 edition 1. Tokyo: UNU/IAS/UNL Center.
• Valambe, M. R. (2001). Sugam marathi vyakaran lekhan [sugama marATI vyAkaraNa leKana]. Pune: Nitin Prakashan.