Download - Machine Learning for Machine Translationacmsc/TMW2014/P_bhattacharyya.pdf · Machine Learning for Machine Translation Pushpak Bhattacharyya CSE Dept., IIT Bombay ... Target Text Analysis

Machine Learning for

Machine Translation

Pushpak Bhattacharyya

CSE Dept

IIT Bombay

ISI Kolkata

Jan 6 2014

Jan 6 2014 ISI Kolkata ML for MT 1

NLP Trinity

Algorithm

Problem

Language Hindi

Marathi

English

French Morph Analysis

Part of Speech Tagging

Parsing

Semantics

CRF

HMM

MEMM

NLP Trinity


NLP Layer

Morphology

POS tagging

Chunking

Parsing

Semantics Extraction

Discourse and Corefernce

Increased Complexity Of Processing


Motivation for MT

MT NLP Complete

NLP AI complete

AI CS complete

How will the world be different when the language barrier disappears

Volume of text required to be translated currently exceeds translatorsrsquo capacity (demand gt supply)

Solution automation


Taxonomy of MT systems

MT Approaches

Knowledge Based Rule Based MT

Data driven Machine Learning Based

Example Based MT (EBMT)

Statistical MT

Interlingua Based Transfer Based


Why is MT difficult

Language divergence


Why is MT difficult Language Divergence

One of the main complexities of MT Language Divergence

Languages have different ways of expressing meaning

Lexico-Semantic Divergence

Structural Divergence

Our work on English-IL Language Divergence with illustrations from Hindi (Dave Parikh Bhattacharyya Journal of MT 2002)


Languages differ in expressing thoughts Agglutination

Finnish ldquoistahtaisinkohanrdquo

English I wonder if I should sit down for a whileldquo

Analysis

ist + sit verb stem

ahta + verb derivation morpheme to do something for a while

isi + conditional affix

n + 1st person singular suffix

ko + question particle

han a particle for things like reminder (with declaratives) or softening (with questions and imperatives)


Language Divergence Theory Lexico-

Semantic Divergences (few examples)

Conflational divergence F vomir E to be sick E stab H chure se maaranaa (knife-with hit) S Utrymningsplan E escape plan

Categorial divergence

Change is in POS category

The play is on_PREP (vs The play is Sunday) Khel chal_rahaa_haai_VM (vs khel ravivaar ko haai)


Language Divergence Theory Structural Divergences

SVOSOV E Peter plays basketball H piitar basketball kheltaa haai

Head swapping divergence E Prime Minister of India H bhaarat ke pradhaan mantrii (India-of Prime

Minister)


Language Divergence Theory Syntactic

Divergences (few examples)

Constituent Order divergence

E Singh the PM of India will address the nation today

H bhaarat ke pradhaan mantrii singh hellip (India-of PM Singhhellip)

Adjunction Divergence

E She will visit here in the summer

H vah yahaa garmii meM aayegii (she here summer-in will come)

Preposition-Stranding divergence

E Who do you want to go with

H kisake saath aap jaanaa chaahate ho (who withhellip)


Vauquois Triangle


Kinds of MT Systems (point of entry from source to the target text)


Illustration of transfer SVOSOV

S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)


Universality hypothesis

Universality hypothesis At the

level of ldquodeep meaningrdquo all texts

are the ldquosamerdquo whatever the

language


Understanding the Analysis-Transfer-Generation over Vauquois triangle (14)

H11 सरकार_न चनावो_क_बाद मबई म करो_क_माधयम_स

अपन राजसव_को बढ़ाया | T11 Sarkaar ne chunaawo ke baad Mumbai me karoM ke

maadhyam se apne raajaswa ko badhaayaa

G11 Government_(ergative) elections_after Mumbai_in

taxes_through its revenue_(accusative) increased

E11 The Government increased its revenue after the

elections through taxes in Mumbai


Interlingual representation complete disambiguation

Washington voted Washington to power Vote

past

Washington power Washington

emphasis

ltis-a gt action

ltis-a gt place

ltis-a gt capability ltis-a gt hellip

ltis-a gt person

goal


Kinds of disambiguation needed for a complete and correct interlingua graph

N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle

Sentence I went with my friend John to the bank to withdraw some money but was disappointed to find it closed

ISSUES Part Of Speech Noun or Verb


Issues to handle


ISSUES Part Of Speech

NER

John is the name of a PERSON


Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference

ldquoitrdquo ldquobankrdquo


Issues to handle



NER

WSD

Co-reference

Subject Drop

Pro drop (subject ldquoIrdquo)


Typical NLP tools used

POS tagger

Stanford Named Entity Recognizer

Stanford Dependency Parser

XLE Dependency Parser

Lexical Resource

WordNet

Universal Word Dictionary (UW++)


System Architecture Stanford

Dependency Parser

XLE Parser

Feature Generation

Attribute Generation

Relation Generation

Simple Sentence Analyser

NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier


Target Sentence Generation from interlingua

Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase

Translation ) (Word form

Generation) (Sequence)


Generation Architecture Deconversion = Transfer + Generation


Transfer Based MT

Marathi-Hindi


Indian Language to Indian Language Machine Translation (ILILMT)

Bidirectional Machine Translation System

Developed for nine Indian language pairs

Approach

Transfer based

Modules developed using both rule based and statistical approach


Architecture of ILILMT System

Morphological Analyzer

Source Text

POS Tagger

Chunker

Vibhakti Computation

Name Entity Recognizer

Word Sense Disambiguatio

n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation


M-H MT system Evaluation

Subjective evaluation based on machine translation quality

Accuracy calculated based on score given by linguists

S5 Number of score 5 Sentences S4 Number of score 4 sentences

S3 Number of score 3 sentences N Total Number of sentences

Accuracy =

Score 5 Correct Translation

Score 4 Understandable with

minor errors


major errors

Score 2 Not Understandable

Score 1 Non sense translation


Evaluation of Marathi to Hindi MT System

0

02

04

06

08

1

12

MorphAnalyzer

POS Tagger Chunker VibhaktiCompute

WSD LexicalTransfer

WordGenerator

Precision

Recall

Module-wise precision and recall


Evaluation of Marathi to Hindi MT System (cont)

Subjective evaluation on translation quality Evaluated on 500 web sentences Accuracy calculated based on score given according to the

translation quality Accuracy 6532

Result analysis

Morph POS tagger chunker gives more than 90 precision but Transfer WSD generator modules are below 80 hence degrades MT quality

Also morph disambiguation parsing transfer grammar and FW disambiguation modules are required to improve accuracy


Statistical Machine Translation


Czeck-English data

[nesu] ldquoI carryrdquo

[ponese] ldquoHe will carryrdquo

[nese] ldquoHe carriesrdquo

[nesou] ldquoThey carryrdquo

[yedu] ldquoI driverdquo

[plavou] ldquoThey swimrdquo


To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data

[DhotA huM] ldquoI carryrdquo

[DhoegA] ldquoHe will carryrdquo

[DhotA hAi] ldquoHe carriesrdquo

[Dhote hAi] ldquoThey carryrdquo

[chalAtA huM] ldquoI driverdquo

[tErte hEM] ldquoThey swimrdquo


Bangla-English data

[bai] ldquoI carryrdquo

[baibe] ldquoHe will carryrdquo

[bay] ldquoHe carriesrdquo

[bay] ldquoThey carryrdquo

[chAlAi] ldquoI driverdquo

[sAMtrAy] ldquoThey swimrdquo


To translate hellip (repeated)

I will carry

They drive

He swims

They will drive


Foundation

Data driven approach Goal is to find out the English sentence e

given foreign language sentence f whose p(e|f) is maximum

Translations are generated on the basis of statistical model

Parameters are estimated using bilingual parallel corpora


SMT Language Model

To detect good English sentences

Probability of an English sentence w1w2 helliphellip wn can be written as

Pr(w1w2 helliphellip wn) = Pr(w1) Pr(w2|w1) Pr(wn|w1 w2 wn-1)

Here Pr(wn|w1 w2 wn-1) is the probability that word wn follows word string w1 w2 wn-1 N-gram model probability

Trigram model probability calculation


SMT Translation Model

P(f|e) Probability of some f given hypothesis English translation e

How to assign the values to p(e|f)

Sentences are infinite not possible to find pair(ef) for all sentences

Introduce a hidden variable a that represents alignments between the individual words in the sentence pair

Sentence level

Word level


Alignment

If the string e= e1l= e1 e2 hellipel has l words and

the string f= f1m=f1f2fm has m words

then the alignment a can be represented by a series a1

m= a1a2am of m values each between 0 and l such that if the word in position j of the f-string is connected to the word in position i of the e-string then aj= i and

if it is not connected to any English word then aj= O


Example of alignment

English Ram went to school

Hindi Raama paathashaalaa gayaa

Ram went to school

ltNullgt Raama paathashaalaa gayaa


Translation Model Exact expression

Five models for estimating parameters in the expression [2]

Model-1 Model-2 Model-3 Model-4 Model-5

Choose alignment given e and m

Choose the identity of foreign word given e m a

Choose the length of foreign language string given e


a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m

emafememaf )|Pr()|Pr()|Pr(

m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(

Proof of Translation Model Exact

expression

m is fixed for a particular f hence

marginalization

marginalization


Alignment


Fundamental and ubiquitous

Spell checking

Translation

Transliteration

Speech to text

Text to speeh


EM for word alignment from

sentence alignment example

English

(1) three rabbits

a b

(2) rabbits of Grenoble

b c d

French

(1) trois lapins

w x

(2) lapins de Grenoble

x y z


Initial Probabilities

each cell denotes t(a w) t(a x) etc

a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14

The counts in IBM Model 1

Works by maximizing P(f|e) over the entire corpus

For IBM Model 1 we get the following relationship

c (w f |w e f e ) =t (w f |w e )

t (w f |we

0 ) +hellip + t (w f |wel )

c (w f |w e f e ) is the fractional count of the alignment of w f

with w e in f and e

t (w f |w e ) is the probability of w f being the translation of w e

is the count of w f in f

is the count of w e in e


Example of expected count

C[aw (a b)(w x)]

t(aw)

= ------------------------- X (a in lsquoa brsquo) X (w in lsquow xrsquo)

t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14


ldquocountsrdquo b c d

x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0


Revised probability example

trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)


Revised probabilities table

a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13

ldquorevised countsrdquo b c d

x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0


Re-Revised probabilities table a b c d

w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13

Continue until convergence notice that (bx) binding gets progressively stronger

b=rabbits x=lapins

Derivation of EM based Alignment Expressions

Hindi)(Say language ofy vocabular

English)(Say language ofry vocalbula

2

1

LV

LV

F

E

what is in a name नाम म कया ह naam meM kya hai name in what is what is in a name That which we call rose by any other name will smell as sweet

जजस हम गऱाब कहत ह और भी ककसी नाम स उसकी कशब सामान मीठा होगी Jise hum gulab kahte hai aur bhi kisi naam se uski khushbu samaan mitha hogii That which we rose say any other name by its smell as sweet That which we call rose by any other name will smell as sweet

E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF

what is in a name that which we call rose by any other will smell as sweet

naam meM kya hai jise hum gulab kahte hai aur bhi kisi bhi uski khushbu saman mitha hogii


Key Notations English vocabulary 119881119864 French vocabulary 119881119865 No of observations sentence pairs 119878 Data 119863 which consists of 119878 observations looks like

11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878

No words on English side in 119904119905119893 sentence 119897119904 No words on French side in 119904119905119893 sentence 119898119904 119894119899119889119890119909119864 119890

119904119901 =Index of English word 119890119904119901in English vocabularydictionary

119894119899119889119890119909119865 119891119904119902 =Index of French word 119891119904119902in French vocabularydictionary

(Thanks to Sachin Pawar for helping with the maths formulae processing)


Hidden variables and parameters

Hidden Variables (Z) Total no of hidden variables = 119897119904 119898119904119878

119904=1 where each hidden variable is

as follows 119911119901119902119904 = 1 if in 119904119905119893 sentence 119901119905119893 English word is mapped to 119902119905119893 French

word 119911119901119902119904 = 0 otherwise

Parameters (Θ) Total no of parameters = 119881119864 times 119881119865 where each parameter is as follows 119875119894119895 = Probability that 119894119905119893 word in English vocabulary is mapped to 119895119905119893

word in French vocabulary


Likelihoods Data Likelihood L(D Θ)

Data Log-Likelihood LL(D Θ)

Expected value of Data Log-Likelihood E(LL(D Θ))


Constraint and Lagrangian

119875119894119895

119881119865

119895=1

= 1 forall119894


Differentiating wrt Pij


Final E and M steps

M-step

E-step


Combinatorial considerations


Example

ISI Kolkata ML for MT Jan 6 2014 67

All possible alignments


First fundamental requirement of SMT

Alignment requires evidence of

bull firstly a translation pair to introduce the

POSSIBILITY of a mapping

bull then another pair to establish with

CERTAINTY the mapping


For the ldquocertaintyrdquo

We have a translation pair containing alignment candidates and none of the other words in the translation pair

OR

We have a translation pair containing all words in the translation pair except the alignment candidates


Thereforehellip

If M valid bilingual mappings exist in a translation pair then an additional M-1 pairs of translations will decide these mappings with certainty


Rough estimate of data requirement

SMT system between two languages L1 and L2 Assume no a-priori linguistic or world knowledge

ie no meanings or grammatical properties of any words phrases or sentences

Each language has a vocabulary of 100000 words can give rise to about 500000 word forms

through various morphological processes assuming each word appearing in 5 different forms on the average For example the word bdquogo‟ appearing in bdquogo‟ bdquogoing‟

bdquowent‟ and bdquogone‟


Reasons for mapping to multiple words

Synonymy on the target side (eg ldquoto gordquo in English translating to ldquojaanaardquo ldquogaman karnaardquo ldquochalnaardquo etc in Hindi) a phenomenon called lexical choice or register

polysemy on the source side (eg ldquoto gordquo translating to ldquoho jaanaardquo as in ldquoher face went red in angerrdquordquousakaa cheharaa gusse se laal ho gayaardquo)

syncretism (ldquowentrdquo translating to ldquogayaardquo ldquogayiirdquo or ldquogayerdquo) Masculine Gender 1st or 3rd person singular number past tense non-progressive aspect declarative mood


Estimate of corpora requirement

Assume that on an average a sentence is 10 words long

an additional 9 translation pairs for getting at one of the 5 mappings

10 sentences per mapping per word

a first approximation puts the data requirement at 5 X 10 X 500000= 25 million parallel sentences

Estimate is not wide off the mark

Successful SMT systems like Google and Bing reportedly use 100s of millions of translation pairs


Our work on factor based SMT

Ananthakrishnan Ramanathan Hansraj Choudhary Avishek Ghosh and Pushpak Bhattacharyya Case markers and Morphology Addressing the crux of the fluency problem in English-Hindi SMT ACL-IJCNLP 2009 Singapore August 2009


Case Marker and Morphology crucial in E-H MT

Order of magnitiude facelift in Fluency and fidelity

Determined by the combination of suffixes and semantic relations on the English side

Augment the aligned corpus of the two languages with the correspondence of English suffixes and semantic relations with Hindi suffixes and case markers


Semantic relations+SuffixesCase

Markers+inflections

I ate mangoes

I ltagt ate eatpast mangoes ltobj

I ltagt mangoes ltobjpl eatpast

mei_ne aam khaa_yaa


Our Approach

Factored model (Koehn and Hoang 2007) with the following translation factor

suffix + semantic relation case markersuffix

Experiments with the following relations

Dependency relations from the stanford parser

Deeper semantic roles from Universal Networking Language (UNL)


Our Factorization


Experiments


Corpus Statistics


Results The impact of suffix and semantic factors


Results The impact of reordering and semantic relations


Subjective Evaluation The impact of reordering and semantic relations


Impact of sentence length (F Fluency AAdequacy E Errors)


A feel for the improvement-baseline


A feel for the improvement-reorder


A feel for the improvement-Semantic relation


A recent study

PAN Indian SMT


Pan-Indian Language SMT httpwwwcfiltiitbacinindic-translator

SMT systems between 11 languages 7 Indo-Aryan Hindi Gujarati Bengali Oriya Punjabi

Marathi Konkani 3 Dravidian languages Malayalam Tamil Telugu English

Corpus Indian Language Corpora Initiative (ILCI) Corpus Tourism and Health Domains 50000 parallel sentences

Evaluation with BLEU METEOR scores also show high correlation with BLEU


SMT Systems Trained

Phrase-based (PBSMT) baseline system (S1)

E-IL PBSMT with Source side reordering rules (Ramanathan et al 2008) (S2)

E-IL PBSMT with Source side reordering rules (Patel et al 2013) (S3)

IL-IL PBSMT with transliteration post-editing (S4)


Natural Partitioning of SMT systems

Clear partitioning of translation pairs by language family pairs based on translation accuracy

Shared characteristics within language families make translation simpler

Divergences among language families make translation difficult

Baseline PBSMT - BLEU scores (S1)


The Challenge of Morphology Morphological complexity vs BLEU

Training Corpus size vs BLEU

Vocabulary size is a proxy for morphological complexity Note For Tamil a smaller corpus was used for computing vocab size

bull Translation accuracy decreases with increasing morphology

bull Even if training corpus is increased commensurate improvement in translation accuracy is not seen for morphologically rich languages

bull Handling morphology in SMT is critical Jan 6 2014 ISI Kolkata ML for MT 93

Common Divergences Shared Solutions

All Indian languages have similar word order The same structural divergence between English and Indian

languages SOVlt-gtSVO etc Common source side reordering rules improve E-IL

translation by 114 (generic) and 186 (Hindi-adapted) Common divergences can be handled in a common

framework in SMT systems ( This idea has been used for knowledge based MT systems eg Anglabharati )

Comparison of source reordering methods for E-IL SMT - BLEU scores (S1S2S3)


Harnessing Shared Characteristics

Out of Vocabulary words are transliterated in a post-editing step Done using a simple transliteration scheme which harnesses the common

phonetic organization of Indic scripts Accuracy Improvements of 05 BLEU points with this simple approach

Harnessing common characteristics can improve SMT output

PBSMT+ transliteration post-editing for E-IL SMT - BLEU scores (S4)


Cognition and Translation Measuring Translation Difficulty

Abhijit Mishra and Pushpak Bhattacharyya Automatically Predicting Sentence Translation Difficulty ACL 2013 Sofia Bulgaria 4-9 August 2013

96 Jan 6 2014 ISI Kolkata ML for MT

Scenario

Sentences

John ate jam

John ate jam made from apples

John is in a jam

Subjective notion of difficulty

Easy

Moderate

Difficult


Use behavioural data

Use behavioural data to decipher strong AI algorithms

Specifically

For WSD by humans see where the eye rests for clues

For the innate translation difficulty of sentences see how the eye moves back and forth over the sentences


Image Courtesy httpwwwsmashingmagazinecom2007100930-usability-issues-to-be-aware-of

Fixations

Saccades


Eye Tracking data Gaze points Position of eye-gaze on the

screen

Fixations A long stay of the gaze on a particular object on the screen

Fixations have both Spatial (coordinates) and Temporal (duration) properties

Saccade A very rapid movement of eye between the positions of rest

Scanpath A path connecting a series of fixations

Regression Revisiting a previously read segment


Controlling the experimental setup for eye-tracking

Eye movement patterns influenced by factors like age working proficiency environmental distractions etc

Guidelines for eye tracking Participants metadata (age expertise occupation)

etc

Performing a fresh calibration before each new experiment

Minimizing the head movement

Introduce adequate line spacing in the text and avoid scrolling

Carrying out the experiments in a relatively low light environment


Use of eye tracking

Used extensively in Psychology

Mainly to study reading processes

Seminal work Just MA and Carpenter PA (1980) A theory of reading from eye fixations to comprehension Psychological Review 87(4)329ndash354

Used in flight simulators for pilot training


NLP and Eye Tracking research

Kliegl (2011)- Predict word frequency

and pattern from eye movements Doherty et al (2010)- Eye-tracking as an

automatic Machine Translation Evaluation Technique

Stymne et al (2012)- Eye-tracking as a tool for Machine Translation (MT) error analysis

Dragsted (2010)- Co-ordination of reading and writing process during translation

Relatively new and open research direction


Translation Difficulty Index (TDI)

Motivation route sentences to translators with right competence as per difficulty of translating

On a crowdsourcing platform eg

TDI is a function of

sentence length (l)

degree of polysemy of constituent words (p) and

structural complexity (s)


Contributor to TDI length

What is more difficult to translate

John eats jam vs

John eats jam made from apples vs

John eats jam made from apples grown in orchards vs

John eats bread made from apples grown in orchards on black soil


Contributor to TDI polysemy


John is in a jam vs

John is in difficulty

Jam has 4 diverse senses difficulty has 4 related senses


Contributor to TDI structural complexity


John is in a jam His debt is huge The lenders cause him to shy from them every moment he sees them vs

John is in a jam caused by his huge debt which forces him to shy from his lenders every moment he sees them


Measuring translation through Gaze data

Translation difficulty indicated by

staying of eye on segments

Jumping back and forth between segments

Example

The horse raced past the garden fell


Measuring translation difficulty through Gaze data




Example


बगीचा क पास स दौडाया गया घोड़ा गगर गया bagiichaa ke pas se doudaayaa gayaa ghodaa

gir gayaa

The translation process will complete the task till garden and then backtrack revise restart and translate in a different way 109 Jan 6 2014 ISI Kolkata ML for MT

Scanpaths indicator of translation difficulty

(Malsburg et al 2007)

Sentence 2 is a clear case of ldquoGarden pathingrdquo

which imposes cognitive load on participants and the prefer syntactic re-analysis


Translog A tool for recording Translation Process Data Translog (Carl 2012) A Windows based program

Built with a purpose of recording gaze and key-stroke data during translation

Can be used for other reading and writing related studies

Using Translog one can Create and Customize translationreading and writing

experiments involving eye-tracking and keystroke logging

Calibrate the eye-tracker

Replay and analyze the recorded log files

Manually correct errors in gaze recording


TPR Database The Translation Process Research (TPR) database

(Carl 2012) is a database containing behavioral data for translation activities

Contains Gaze and Keystroke information for more than 450 experiments

40 different paragraphs are translated into 7 different languages from English by multiple translators

At least 5 translators per language

Source and target paragraphs are annotated with POS tags lemmas dependency relations etc

Easy to use XML data format


Experimental setup (12)

Translators translate sentence by sentence typing to a text box

The display screen is attached with a remote eye-tracker which

constantly records the eye movement of the translator



Extracted 20 different text categories from the data

Each piece of text contains 5-10 sentences

For each category we had at least 10 participants who translated the text into different target languages


A predictive framework for TDI

Direct annotation of TDI is fraught with subjectivity and ad-hocism

We use translator‟s gaze data as annotation to prepare training data

Training data

Regressor

Labeling through gaze analysis Features

Test Data

TDI


Annotation of TDI (14)

First approximation -gt TDI equivalent to ldquotime taken to translaterdquo

However time taken to translate may not be strongly related to translation difficulty

It is difficult to know what fraction of the total time is spent on translation related thinking

Sensitive to distractions from the environment



Instead of the ldquotime taken to translaterdquo consider ldquotime for which translation related processing is carried out by the brainrdquo

This is called Translation Processing Time given by

119879119901 = 119879119888119900119898119901+119879119892119890119899

Tcomp and Tgen are the comprehension of source text comprehension and target text generation respectively



Humans spend time on what they see and this ldquotimerdquo is correlated with the complexity of the information being processed

f- fixation s- saccade Fs- source Ft- target

119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905



The measured TDI score is the Tp normalized over sentence length

119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features

Length Word count of the sentences

Degree of Polysemy Sum of number of senses of each word in the WordNet normalized by length

Structural Complexity If the attachment units lie far from each other the sentence has higher structural complexity Lin (1996) defines it as the total length of dependency links in the dependency structure of the sentence

Measured TDI for TPR database for 80 sentences


Experiment and results

Training data of 80 examples 10-fold cross validation

Features computed using Princeton WordNet and Stanford Dependency Parser

Support Vector Regression technique (Joachims et al 1999) along with different kernels

Error analysis was done by Mean Squared Error estimate

We also computed the correlation of the predicted TDI with the measured TDI


Examples from the dataset


Summary

Covered Interlingual based MT the oldest approach to MT

Covered SMT the newest approach to MT

Presented some recent study in the context of Indian Languages

Covered eye tracking bring cognitive aspects to MT


Summary

SMT is the ruling paradigm

But linguistic features can enhance performance especially the factored based SMT with factors coming from interlingua

Large scale effort sponsored by ministry of IT TDIL program to create MT systems

Parallel corpora creation is also going on in a consortium mode


Conclusions NLP has assumed great importance because

of large amount of text in e-form

Machine learning techniques are increasingly applied

Highly relevant for India where multilinguality is way of life

Machine Translation is more fundamental and ubiquitous than just mapping between two languages

Utterancethought

Speech to speech online translation Jan 6 2014 ISI Kolkata ML for MT 125

Pubs httpwwcseiitbacin~pb

Resources and tools

httpwwwcfiltiitbacin


NLP Trinity

Algorithm

Problem

Language Hindi

Marathi

English

French Morph Analysis

Part of Speech Tagging

Parsing

Semantics

CRF

HMM

MEMM

NLP Trinity


NLP Layer

Morphology

POS tagging

Chunking

Parsing





Motivation for MT

MT NLP Complete

NLP AI complete

AI CS complete



Solution automation



MT Approaches




Statistical MT



Why is MT difficult

Language divergence












Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



NLP Layer

Morphology

POS tagging

Chunking

Parsing





Motivation for MT

MT NLP Complete

NLP AI complete

AI CS complete



Solution automation



MT Approaches




Statistical MT



Why is MT difficult

Language divergence












Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Motivation for MT

MT NLP Complete

NLP AI complete

AI CS complete



Solution automation



MT Approaches




Statistical MT



Why is MT difficult

Language divergence












Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




MT Approaches




Statistical MT



Why is MT difficult

Language divergence












Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Why is MT difficult

Language divergence












Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools













Analysis

ist + sit verb stem

















Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools













Minister)














Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools















Vauquois Triangle





S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






S

NP VP

V N NP

N John eats

bread

S

NP VP

V N

John eats

NP

N

bread

(transfer svo sov)






language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools







language













past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools














past


emphasis

ltis-a gt action

ltis-a gt place


ltis-a gt person

goal



N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




N Name

P POS

A Attachment

S Sense

C Co-reference

R Semantic Role


Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Issues to handle




Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Issues to handle



NER



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Issues to handle



NER

WSD Financial

bank or River bank


Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Issues to handle



NER

WSD

Co-reference



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Issues to handle



NER

WSD

Co-reference

Subject Drop




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




POS tagger




Lexical Resource

WordNet




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




Dependency Parser

XLE Parser

Feature Generation


Relation Generation


NER


WSD

Clause Marker

Merger

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simple Enco

Simplifier



Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




Lexical Transfer

Target Sentence

Generation

Syntax

Planning

Morphological

Synthesis

(WordPhrase






Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Transfer Based MT

Marathi-Hindi





Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






Approach

Transfer based





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Source Text

POS Tagger

Chunker




n

Lexical Transfer

Agreement Feature

Interchunk

Word Generator

Intrachunk

Target Text

Analysis

Transfer

Generation







Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools








Accuracy =



minor errors


major errors





0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




0

02

04

06

08

1

12

MorphAnalyzer


WSD LexicalTransfer

WordGenerator

Precision

Recall






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






Result analysis






Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Czeck-English data








To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



To translate hellip

I will carry

They drive

He swims

They will drive


Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Hindi-English data








Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Bangla-English data









I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




I will carry

They drive

He swims

They will drive


Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Foundation






SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



SMT Language Model












Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools








Sentence level

Word level


Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Alignment










Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






Ram went to school










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools










a

eafef )|Pr()|Pr(

m

emafeaf )|Pr()|Pr(

m


m

emafem )|Pr()|Pr(

m

m

j

jj

jj emfaafem1

1

1

1

1 )|Pr()|Pr(

m

j

jj

j

jj

j

m

emfafemfaaem1

1

11

1

1

1

1 )|Pr()|Pr()|Pr(

)|Pr( emaf )|Pr( em

m

j

jj

j

jj

j emfafemfaa1

1

11

1

1

1

1 )|Pr()|Pr(


expression


marginalization

marginalization


Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Alignment



Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




Spell checking

Translation

Transliteration

Speech to text

Text to speeh




English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





English

(1) three rabbits

a b


b c d

French

(1) trois lapins

w x


x y z




a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





a b c d

w 14 14 14 14

x 14 14 14 14

y 14 14 14 14

z 14 14 14 14




c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






c (w f |w e f e ) =t (w f |w e )

t (w f |we



with w e in f and e






C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




C[aw (a b)(w x)]

t(aw)


t(aw)+t(ax)

14

= ----------------- X 1 X 1= 12

14+14



x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




x y z

a b c d

w 0 0 0 0

x 0 13 13 13

y 0 13 13 13

z 0 13 13 13

a b

w x

a b c d

w 12 12 0 0

x 12 12 0 0

y 0 0 0 0

z 0 0 0 0



trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




trevised(a w)

12

= -------------------------------------------------------------------

(12+12 +0+0 )(a b)( w x) +(0+0+0+0 )(b c d) (x y z)



a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




a b c d

w 12 14 0 0

x 12 512 13 13

y 0 16 13 13

z 0 16 13 13


x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




x y z

a b c d

w 0 0 0 0

x 0 59 13 13

y 0 29 13 13

z 0 29 13 13

a b

w x

a b c d

w 12 38 0 0

x 12 58 0 0

y 0 0 0 0

z 0 0 0 0



w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




w 12 316 0 0

x 12 85144 13 13

y 0 19 13 13

z 0 19 13 13


b=rabbits x=lapins




2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






2

1

LV

LV

F

E



E1

F1

E2

F2


Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Vocabulary mapping

Vocabulary VE VF





11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




11989011 11989012 hellip 119890

11198971 119891

11 11989112 hellip 119891

11198981

11989021 11989022 hellip 119890

21198972 119891

21 11989122 hellip 119891

21198982

1198901199041 119890

1199042 hellip 119890

119904119897119904 119891

1199041 1198911199042 hellip 119891

119904119898119904

1198901198781 119890

1198782 hellip 119890

119878119897119878 119891

1198781 1198911198782 hellip 119891

119878119898119878










word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools







word 119911119901119902119904 = 0 otherwise









119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools








119875119894119895

119881119865

119895=1

= 1 forall119894




Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Final E and M steps

M-step

E-step




Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Example













OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools














OR



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Thereforehellip
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools
































Markers+inflections

I ate mangoes



mei_ne aam khaa_yaa


Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Our Approach







Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Our Factorization


Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Experiments


Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Corpus Statistics
















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools

















A recent study

PAN Indian SMT








SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools









SMT Systems Trained


































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools































Scenario

Sentences

John ate jam


John is in a jam


Easy

Moderate

Difficult




Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





Specifically





Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




Fixations

Saccades



screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools




screen










etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






etc






Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Use of eye tracking


















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools















sentence length (l)






John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





John eats jam vs







John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





John is in a jam vs













Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools












Example







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools







Example



gir gayaa





































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






































Training data

Regressor


Test Data

TDI











119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools












119879119901 = 119879119888119900119898119901+119879119892119890119899






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools






119879119901 = 119889119906119903 119891 +

119891 isin 119865119904

119889119906119903 119904 +

119904 isin 119878119904

119889119906119903 119891 +

119891 isin 119865119905

119889119906119903 119904

119904 isin 119878119905




119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools





119879119863119868119898119890119886119904119906119903119890119889 =119879119901

119904119890119899119905119890119899119888119890_119897119890119899119892119905119893


Features















Summary






Summary











Utterancethought



Resources and tools



Features















Summary






Summary











Utterancethought



Resources and tools












Summary






Summary











Utterancethought



Resources and tools



Summary











Utterancethought



Resources and tools








Utterancethought



Resources and tools




Resources and tools



Download - Machine Learning for Machine Translationacmsc/TMW2014/P_bhattacharyya.pdf · Machine Learning for Machine Translation Pushpak Bhattacharyya CSE Dept., IIT Bombay ... Target Text Analysis

Top Related