
Postgraduate Diploma in Translation
Machine Translation II.2 – February 2006

Example Based Machine Translation
Statistical Machine Translation


Three ways to lighten the load

- Restrict coverage to specialised domains
- Exploit existing sources of knowledge (convert machine-readable dictionaries)
- Try to manage without explicit representations:
  - Example Based MT (EBMT)
  - Statistical MT (SMT)


Today’s Lecture

- Example Based MT
- Statistical MT


Part I

Example Based Machine Translation


EBMT

The basic idea is that instead of being based on rules and abstract representations, translation should be based on a database of examples.

Each example is a pairing of a source fragment with a target fragment.

The original intuition came from Nagao, a well-known pioneer in the field of English/Japanese translation.


EBMT (Nagao 1984)

Man does translation "by properly decomposing an input sentence into certain fragmental phrases, then by translating these phrases into other language phrases, and finally by properly composing these fragmental translations into one long sentence."


Three Step Process

- Match: identify relevant source language examples in the database.
- Align: find the corresponding fragments in the target language.
- Recombine: put the target language fragments together to form sentences (all three steps are sketched below).
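To make the three steps concrete, here is a toy end-to-end sketch in Python. The small ALIGNED table stands in for both the example database and the hard alignment step; everything here is illustrative, not Pangloss's actual design.

```python
# Toy sketch of Match / Align / Recombine. The ALIGNED table stands in
# for the example database plus the (hard) alignment step.

ALIGNED = {
    "the cat eats": "le chat mange",
    "a dog": "un chien",
}

def translate(sentence):
    words = sentence.lower().rstrip(".").split()
    out, i = [], 0
    while i < len(words):
        # Match: greedily take the longest chunk known to the database.
        for j in range(len(words), i, -1):
            chunk = " ".join(words[i:j])
            if chunk in ALIGNED:
                out.append(ALIGNED[chunk])  # Align: look up the target fragment
                i = j
                break
        else:
            i += 1  # unmatched word: skip it in this toy version
    return " ".join(out)  # Recombine: concatenate fragments in source order

print(translate("The cat eats a dog."))  # le chat mange un chien
```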


EBMT

Example Based Machine Translation as used in the Pangloss system at Carnegie Mellon University.

Based on Notes by Dave Inman


EBMT Corpus & Index

Corpus
S1: The cat eats a fish. Le chat mange un poisson.
S2: A dog eats a cat. Un chien mange un chat.
…
S99,999,999: …

Index
the: S1
cat: S1, S2
eats: S1, S2
fish: S1
dog: S2
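A minimal sketch of such a corpus and inverted index in Python (a hypothetical layout, not Pangloss's actual storage format):

```python
from collections import defaultdict

# Toy example database: sentence ID -> (source, target).
corpus = {
    "S1": ("The cat eats a fish.", "Le chat mange un poisson."),
    "S2": ("A dog eats a cat.", "Un chien mange un chat."),
}

# Inverted index: source word -> set of sentence IDs containing it.
index = defaultdict(set)
for sid, (source, _target) in corpus.items():
    for word in source.lower().rstrip(".").split():
        index[word].add(sid)

print(sorted(index["cat"]))   # ['S1', 'S2']
print(sorted(index["fish"]))  # ['S1']
```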


EBMT: find chunks

A source language sentence is input:
The cat eats a dog.

Chunks of this sentence are matched against the corpus:
The cat: S1
The cat eats: S1
The cat eats a: S1
a dog: S2
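A rough sketch of chunk matching, reusing the index from the previous sketch: a chunk of two or more adjacent input words "matches" if some corpus sentence contains all of its words. Real matching would also check adjacency and order within the example.

```python
# Sketch of chunk matching via the inverted index built above.
def find_chunks(sentence, index, min_len=2):
    words = sentence.lower().rstrip(".").split()
    hits = []
    for i in range(len(words)):
        for j in range(i + min_len, len(words) + 1):
            chunk = words[i:j]
            # Corpus sentences containing every word of the chunk.
            shared = set.intersection(*(index.get(w, set()) for w in chunk))
            if shared:
                hits.append((" ".join(chunk), sorted(shared)))
    return hits

for chunk, sids in find_chunks("The cat eats a dog.", index):
    print(chunk, "->", sids)   # e.g. "the cat" -> ['S1'], "a dog" -> ['S2']
```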


Match and Align Chunks

For each chunk, retrieve the target from the corpus:

the cat eats: S1 – The cat eats a fish. / Le chat mange un poisson.
a dog: S2 – A dog eats a cat. / Un chien mange un chat.

The chunks are aligned with the target sentences:

The cat eats → Le chat mange (in "Le chat mange un poisson")

Alignment is difficult.


Recombination

Chunks are scored to find good matches:

The cat eats / Le chat mange   Score 78%
The cat eats / Le chat dorme   Score 43%
a dog / un chien               Score 67%
a dog / le chien               Score 56%
a dog / un arbre               Score 22%

The best translated chunks are put together to make the final translation:

The cat eats / Le chat mange + a dog / un chien
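In code, recombination can be as simple as keeping the best-scoring candidate for each chunk and concatenating them in source order; the candidate scores below are the illustrative ones from this slide.

```python
# Pick the best-scoring translation for each chunk, then recombine.
scored = {
    "The cat eats": [("Le chat mange", 0.78), ("Le chat dorme", 0.43)],
    "a dog":        [("un chien", 0.67), ("le chien", 0.56), ("un arbre", 0.22)],
}

translation = " ".join(
    max(candidates, key=lambda c: c[1])[0]   # highest-scoring target fragment
    for candidates in scored.values()        # dicts preserve source order (3.7+)
)
print(translation)  # Le chat mange un chien
```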


What Data Are Needed?

1. A bilingual dictionary… but we can induce this from the corpus.

2. A target language root/synonym list… so we can see the similarity between words and inflected forms (e.g. verbs).

3. Classes of words easily translated… such as numbers, towns, weekdays.

4. A large corpus of parallel sentences… if possible in the same domain as the texts to be translated.


How to create a bilingual lexicon

- Take each sentence pair in the corpus.
- For each word in the source sentence, add each word in the target sentence and increment its frequency count.
- Repeat for as many sentences as possible.
- Use a threshold to get possible alternative translations.


How to create a lexicon

The cat eats a fish. / Le chat mange un poisson.

the:  le,1  chat,1  mange,1  un,1  poisson,1
cat:  le,1  chat,1  mange,1  un,1  poisson,1
eats: le,1  chat,1  mange,1  un,1  poisson,1
a:    le,1  chat,1  mange,1  un,1  poisson,1
fish: le,1  chat,1  mange,1  un,1  poisson,1


After many sentences …

the: le,956  la,925  un,235
------ Threshold ------
chat,47  mange,33  poisson,28  …  arbre,18


After many sentences …

cat: chat,963
------ Threshold ------
le,604  la,485  un,305  mange,33  poisson,28  …  arbre,47
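The counting-and-threshold procedure of the last three slides might look like this in Python. It is a sketch only; as the output shows, with just two sentence pairs the threshold still lets function-word noise through, which is exactly why many sentences are needed.

```python
from collections import Counter, defaultdict

def induce_lexicon(pairs, threshold):
    # Count every target word as a candidate translation of every source word.
    counts = defaultdict(Counter)
    for source, target in pairs:
        for s in source.lower().rstrip(".").split():
            for t in target.lower().rstrip(".").split():
                counts[s][t] += 1
    # Keep only candidates at or above the frequency threshold.
    return {s: [t for t, n in c.most_common() if n >= threshold]
            for s, c in counts.items()}

pairs = [
    ("The cat eats a fish.", "Le chat mange un poisson."),
    ("A dog eats a cat.", "Un chien mange un chat."),
]
print(induce_lexicon(pairs, threshold=2)["cat"])
# ['un', 'chat', 'mange'] -- with only two pairs, function words still pass
```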


Indexing the Corpus

For speed the corpus is indexed on the source language sentences.

Each word in each source language sentence is stored with info about the target sentence.

Words can be added to the corpus and the index easily updated.

Tokens are used for common classes of words (e.g. numbers). This makes matching more effective.
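A small sketch of such class tokenisation: members of easily-translated classes are replaced by class tokens before indexing, so superficially different sentences match the same example. The token names and classes are invented for illustration.

```python
import re

# Hypothetical class tokens for numbers and weekdays.
WEEKDAYS = {"monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"}

def tokenise(sentence):
    out = []
    for word in sentence.lower().split():
        if re.fullmatch(r"\d+", word):
            out.append("<NUM>")        # any number matches any other number
        elif word in WEEKDAYS:
            out.append("<DAY>")
        else:
            out.append(word)
    return " ".join(out)

print(tokenise("flight 423 departs on monday"))  # flight <NUM> departs on <DAY>
```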


Finding Chunks to Translate

Look up each word in the source sentence in the index.

Look for chunks in the source sentence (at least 2 adjacent words) which match the corpus.

Select the last few matches against the corpus (translation memory).

Pangloss uses the last 5 matches for any chunk.


Matching a chunk against the target

- For each source chunk found previously, retrieve the target sentences from the corpus (using the index).
- Try to find the translation for the source chunk from these sentences. This is the hard bit!
- Look for the minimum and maximum segments in the target sentences which could correspond to the source chunk. Score each of these segments.


Scoring a segment…

Unmatched words: higher priority is given to sentences containing all the words in the input chunk.

Noise: higher priority is given to corpus sentences which have fewer extra words.

Order: higher priority is given to sentences containing the input words in an order closer to their order in the input chunk.

Morphology: higher priority is given to sentences in which words match exactly rather than as morphological variants.
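One way the first three criteria might be folded into a single score (the formulas are invented for illustration, morphology is omitted, and Pangloss's actual scoring function is not specified in these notes):

```python
def score_segment(chunk, segment):
    """chunk, segment: non-empty lists of lowercased words."""
    matched = [w for w in chunk if w in segment]
    if not matched:
        return 0.0
    coverage = len(matched) / len(chunk)      # unmatched-words criterion
    noise = len(matched) / len(segment)       # extra words lower the score
    positions = [segment.index(w) for w in matched]
    order = 1.0 if positions == sorted(positions) else 0.5   # crude order check
    return coverage * noise * order

print(score_segment(["le", "chat", "mange"],
                    ["le", "chat", "mange", "un", "poisson"]))  # 0.6
```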


Whole Sentence Match

If we are lucky the whole sentence will be found in the corpus!

In that case the target sentence can be used directly, without alignment.

Useful if translation memory is available (sentences recently translated are added to the corpus).


Quality of Translation

Pangloss was tested against source sentences in a different domain to the examples in the corpus.

Pangloss “covered” about 70% of the sentences input.

This means a match was found against the corpus….

…but not necessarily a good match.

Others report that around 60% of the translation output can be understood by a native speaker. Systran manages about 70%.


Speed of Translation

- Translations are much faster than with Systran.
- Simple sentences are translated in seconds.
- The corpus can be added to (translation memory) at about 6 MBytes per minute (Sun SPARCstation).
- A 270 MByte corpus takes 45 minutes to index.


Positive Points

- Fast
- Easy to add a new language pair
- No need to analyse languages (much)
- Can induce a dictionary from the corpus
- Allows easy implementation of translation memory


Negative Points

- Quality is second best at present.
- Depends on a large corpus of parallel, well-translated sentences.
- 30% of the source has no coverage (no translation).
- Matching of words is brittle – we can see a match Pangloss cannot.
- The domain of the corpus should match the domain to be translated, so that chunks match.


Conclusions

- An alternative to Systran
- Faster
- Lower quality
- Quick to develop for a new language pair – if a corpus exists!
- Needs no linguistics
- Might improve as bigger corpora become available?


Part II

Statistical Translation


Statistical Translation

- Robust
- Domain independent
- Extensible
- Does not require language specialists
- Uses the noisy channel model of translation


Noisy Channel Model of Sentence Translation (Brown et al. 1990)

[Diagram: a source sentence passes through a noisy channel and emerges as a target sentence; translation reverses the channel, recovering the source sentence from the observed target.]


Basic Principle

John loves Mary (S)

Jean aime Marie (T)

Given T, we have to find the S for which Ptrans × Ps is greater than for any other S′, where:

Ptrans = the probability that T is a translation of S
Ps = the probability of S
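In standard notation (writing P(S) for Ps and P(T | S) for Ptrans), this is Bayes' decision rule; P(T) is fixed for a given input, so it can be dropped from the maximisation:

```latex
\hat{S} = \arg\max_{S} P(S \mid T)
        = \arg\max_{S} \frac{P(S)\, P(T \mid S)}{P(T)}
        = \arg\max_{S} \underbrace{P(S)}_{P_s} \, \underbrace{P(T \mid S)}_{P_{trans}}
```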


A Statistical MT System

[Diagram: a source language model (supplying Ps) generates S; a translation model (supplying Ptrans) turns S into T; the decoder reverses the process, taking T as input and recovering S.]


The Three Components of a Statistical MT Model

1. A method for computing language model (Ps) probabilities.

2. A method for computing translation (Ptrans) probabilities.

3. A method for searching amongst source sentences for the one that maximises Ptrans × Ps.


Simplest Language Model

The probability Ps of any sentence is the product of the probabilities of the words in it.

For example, the probability of "John loves Mary" is
P(John) × P(loves) × P(Mary)
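A minimal sketch of this unigram model, with invented word probabilities:

```python
from math import prod

# Invented relative frequencies, purely for illustration.
p_word = {"John": 0.001, "loves": 0.002, "Mary": 0.001}

def p_sentence(words):
    # Ps = product of per-word probabilities; tiny floor for unseen words.
    return prod(p_word.get(w, 1e-7) for w in words)

print(p_sentence(["John", "loves", "Mary"]))  # 2e-09
```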


Simplest Translation Model (1)

Assumption: target sentence is generated from the source sentence word-by-word

S: John loves Mary

T: Jean aime Marie


Simplest Translation Model (2)

Ptrans is just the product of the translation probabilities of each of the words.

Ptrans = P(Jean|John) × P(aime|loves) × P(Marie|Mary)
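And a matching sketch of the word-for-word translation model, again with invented numbers:

```python
from math import prod

# Invented word-translation probabilities t(f | e).
t = {("Jean", "John"): 0.9, ("aime", "loves"): 0.8, ("Marie", "Mary"): 0.9}

def p_trans(target, source):
    # Ptrans = product of per-word translation probabilities (word for word).
    return prod(t.get((f, e), 1e-7) for f, e in zip(target, source))

print(p_trans(["Jean", "aime", "Marie"], ["John", "loves", "Mary"]))  # ~0.648
```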


More Realistic Example

The proposal will not now be implemented

Les propositions ne seront pas mises en application maintenant


More Realistic Translation Models

Better translation models include other features, such as:

- Fertility: the number of words in the target that are paired with each source word (0 to N).
- Distortion: the difference in sentence position between the source word and the target word.


Searching

Maintain a list of hypotheses. Initial hypothesis: (Jean aime Marie | *)

Search proceeds iteratively. At each iteration we extend the most promising hypotheses with additional words:

Jean aime Marie | John(1) *
Jean aime Marie | * loves(2) *
Jean aime Marie | * Mary(3) *
Jean aime Marie | Jean(1) *
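A highly simplified beam-search sketch of this idea: keep the few best partial source hypotheses and extend each by one word per iteration. The scoring function is a crude stand-in; a real decoder would score hypotheses with the language and translation models against the observed target sentence.

```python
import heapq

VOCAB = ["John", "loves", "Mary", "Jean"]
BEAM = 3   # number of hypotheses kept per iteration

def score(partial):
    # Stand-in for Ps x Ptrans: reward distinct content words, penalise length.
    return len(set(partial) & {"John", "loves", "Mary"}) - 0.1 * len(partial)

def search(max_len=3):
    beam = [()]
    for _ in range(max_len):
        extended = [h + (w,) for h in beam for w in VOCAB]  # extend by one word
        beam = heapq.nlargest(BEAM, extended, key=score)    # keep the best few
    return beam[0]

print(search())  # ('John', 'loves', 'Mary')
```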


Building Models

- In general, large quantities of data are needed.
- For the language model, we need only source language text.
- For the translation model, we need pairs of sentences that are translations of each other.
- Use the EM algorithm (Baum 1972) to optimize the model parameters (a toy sketch follows).
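A compact, illustrative EM sketch for the word-translation probabilities of the simplest model (in the spirit of IBM Model 1, without NULL alignment). The data are toys and the initialisation is uniform, as in the experiment described on the next slide.

```python
from collections import defaultdict
from itertools import product

# Toy parallel data.
pairs = [(("the", "cat"), ("le", "chat")),
         (("the", "dog"), ("le", "chien"))]

src_vocab = {e for src, _ in pairs for e in src}
tgt_vocab = {f for _, tgt in pairs for f in tgt}

# Uniform start: every target word equally likely for every source word.
t = {(f, e): 1.0 / len(tgt_vocab) for e, f in product(src_vocab, tgt_vocab)}

for _ in range(10):
    counts = defaultdict(float)
    totals = defaultdict(float)
    for src, tgt in pairs:                    # E-step: expected alignment counts
        for f in tgt:
            norm = sum(t[(f, e)] for e in src)
            for e in src:
                c = t[(f, e)] / norm
                counts[(f, e)] += c
                totals[e] += c
    for (f, e) in t:                          # M-step: re-normalise the counts
        t[(f, e)] = counts[(f, e)] / totals[e] if totals[e] else 0.0

print(round(t[("chat", "cat")], 2))  # well above the uniform 1/3, heading toward 1.0
```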


Experiment 1 (Brown et al. 1990)

- Hansard. 40,000 pairs of sentences = approx. 800,000 words in each language.
- Considered the 9,000 most common words in each language.
- Assumptions (initial parameter values):
  - each of the 9,000 target words is equally likely as a translation of each of the source words
  - each of the fertilities from 0 to 25 is equally likely for each of the 9,000 source words
  - each target position is equally likely given each source position and target length


English: the

French   Probability
le       .610
la       .178
l'       .083
les      .023
ce       .013
il       .012
de       .009
à        .007
que      .007

Fertility   Probability
1           .871
0           .124
2           .004


English: not

French        Probability
pas           .469
ne            .460
non           .024
pas du tout   .003
faux          .003
plus          .002
ce            .002
que           .002
jamais        .002

Fertility   Probability
2           .758
0           .133
1           .106


English: hear

French     Probability
bravo      .992
entendre   .005
entendu    .002
entends    .001

Fertility   Probability
0           .584
1           .416


Experiment 2

- Translation was performed using the 1,000 most frequent words in the English corpus.
- The 1,700 most frequently used French words in translations of sentences completely covered by the 1,000-word English vocabulary.
- 117,000 pairs of sentences completely covered by both vocabularies.
- Parameters of the English language model were estimated from 570,000 sentences in the English part.


Experiment 2 (contd)

73 French sentences from elsewhere in the corpus were tested. Results were classified as:

- Exact – same as the actual translation
- Alternate – same meaning
- Different – a legitimate translation but with a different meaning
- Wrong – could not be interpreted as a translation
- Ungrammatical – grammatically deficient

Corrections to the last three categories were made and the keystrokes were counted.


Results

Category        # sentences   Percent
Exact                4            5
Alternate           18           25
Different           13           18
Wrong               11           15
Ungrammatical       27           37
Total               73          100


Results - Discussion

According to Brown et al., the system performed successfully 48% of the time (the first three categories).

776 keystrokes were needed to repair the system's output, compared with 1,916 keystrokes to generate all 73 translations from scratch.

According to the authors, the system therefore reduces work by about 60% (1 − 776/1916 ≈ 0.6).


Bibliography

Statistical MT: Brown et al., "A Statistical Approach to Machine Translation", Computational Linguistics 16(2), 1990, pp. 79-85 (search "ACL Anthology").