Semantic Addressable Encoding
Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang
Department of Computer Science and Information Engineering, National Taiwan University
TC402, Oct. 5, ICONIP 2006, Hong Kong
Web: red.csie.ntu.edu.tw
Sentence generating function
The semantic world of Mark Twain
Semantic search under Shakespeare
Outline
Introduction
Encoding method
Elman network
The word corpus – Elman's idea
Review of semantic search
Multidimensional scaling (MDS) space
Representative vector of a document
Iterative re-encoding
Example
Summary
Introduction
A central problem in semantic analysis is how to effectively encode and extract the contents of word sequences.
The traditional way of creating a prime semantic space is extremely expensive and complex, because experienced linguists are required to analyze a huge number of words.
This paper presents an automatic encoding process.
Elman Network
U_oh: Lh × Lo weight matrix; U_hi: Li × Lh weight matrix; U_hc: Lc × Lh weight matrix
Lo = # neurons in the output layer; Lh = # neurons in the hidden layer; Li = # neurons in the input layer; Lc = # neurons in the context layer
The context layer carries memory
The hidden layer activates output layer and refreshes context layer
Desired behavior after training process
H(w(t)) = f(U_hi w(t) + U_hc x(t-1)), where x(t-1) is the previous hidden output stored in the context layer
f(x) = 1.7159 tanh(2x/3)
w(t+1) ≈ E(w(t)) = f(U_oh H(w(t)))
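The forward pass above can be sketched in a few lines of NumPy. This is a minimal illustration: the dimensions and random weights are assumptions for the demo, not the paper's trained network.

```python
import numpy as np

def f(x):
    # Activation from the slides: f(x) = 1.7159 * tanh(2x/3)
    return 1.7159 * np.tanh(2.0 * x / 3.0)

def elman_step(w_t, context, U_hi, U_hc, U_oh):
    """One forward step: the hidden layer reads the input word and the
    context (the previous hidden output), then activates the output
    layer, which after training predicts the next word."""
    h = f(U_hi @ w_t + U_hc @ context)  # H(w(t)): refreshes the context
    e = f(U_oh @ h)                     # E(w(t)) ~ w(t+1) after training
    return e, h

# Illustrative dimensions and untrained random weights.
rng = np.random.default_rng(0)
Li, Lh, Lo = 4, 6, 4
U_hi = 0.1 * rng.standard_normal((Lh, Li))
U_hc = 0.1 * rng.standard_normal((Lh, Lh))
U_oh = 0.1 * rng.standard_normal((Lo, Lh))

context = np.zeros(Lh)
for w in rng.standard_normal((3, Li)):  # feed a 3-word sequence
    prediction, context = elman_step(w, context, U_hi, U_hc, U_oh)
```

Because the activation is a scaled tanh, every hidden and output value stays strictly inside (-1.7159, 1.7159).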
The word corpus – Elman’s idea
All words are coded with certain given lexical codes and all word sequences in corpus D follow the syntax (Noun + Verb + Noun).
After training, input all sequences again and record all hidden outputs for each individual input.
Obtain the new code for the nth word by averaging all vectors in the set S_En.
Construct a word tree based on the new codes to explore the relationship between words.
S_En = { H(w(t)) | w(t) = w_n, w(t) ∈ D }
w_n = (1/|S_En|) Σ_{H(w(t)) ∈ S_En} H(w(t)),  n = 1, …, N
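The averaging step can be sketched as follows; `average_codes` and its arguments are assumed names for this toy illustration.

```python
import numpy as np

def average_codes(hidden_outputs, word_ids, N):
    """New code for word n = mean of all hidden vectors recorded while
    word n was the input (the set S_En in the slides).

    hidden_outputs : (T, Lh) array, one hidden vector per corpus position
    word_ids       : length-T sequence, word index at each position
    N              : vocabulary size
    """
    hidden_outputs = np.asarray(hidden_outputs)
    codes = np.zeros((N, hidden_outputs.shape[1]))
    for n in range(N):
        members = hidden_outputs[[t for t, w in enumerate(word_ids) if w == n]]
        if len(members):
            codes[n] = members.mean(axis=0)  # average over S_En
    return codes

# Toy corpus: word 0 appears at positions 0 and 2.
H = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
codes = average_codes(H, [0, 1, 0], N=2)
# codes[0] is the mean of H[0] and H[2] -> [2.0, 2.0]
```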
Review semantic search
The conventional semantic search constructs a semantic model and a semantic measure.
A semantic code set manually designed by experts is used in the model (the main focus here).
One can build a raw semantic matrix W for all N different words.
The code of a word is a column vector of R features.
One may use the orthogonal space configured by the characteristic (eigen) decomposition of the matrix WW^T.
W = [w_1 w_2 … w_N]  (an R × N matrix)
w_n = [w_n1 w_n2 … w_nR]^T
The semantic search
Since WW^T is a symmetric matrix, all its eigenvalues are real and nonnegative.
Each eigenvalue λ_i equals the variance of the N projections of the codes on the ith eigenvector f_i; that is,
WW^T = F Λ F^T
Λ = diag(λ_1, λ_2, …, λ_R), with λ_1 ≥ λ_2 ≥ … ≥ λ_R ≥ 0
F = [f_1 f_2 … f_R], with f_r^T f_r = 1
λ_i = Σ_{n=1}^N (w_n^T f_i)²,  i = 1, …, R
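The decomposition can be checked numerically in a few lines; the random matrix below stands in for the real semantic codes.

```python
import numpy as np

rng = np.random.default_rng(1)
R, N = 5, 50
W = rng.standard_normal((R, N))   # raw semantic matrix, one column per word

C = W @ W.T                       # R x R, symmetric
lams, F = np.linalg.eigh(C)       # eigh: real eigenvalues, orthonormal F
lams, F = lams[::-1], F[:, ::-1]  # reorder to descending eigenvalues

# Each eigenvalue equals the summed squared projections of the codes
# on its eigenvector: lambda_i = sum_n (w_n^T f_i)^2.
proj_var = np.sum((F.T @ W) ** 2, axis=1)
```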
Multidimensional Scaling (MDS) space
Select a set of Rs eigenvectors {f_r, r = 1…Rs} from all R eigenvectors to build a reduced feature space.
The MDS space is MDS = span{F_s}. These selected features are independent and significant. The new code of each word in this space is
F_s = [f_1 f_2 … f_{Rs}]  (an R × Rs matrix)
W^s = F_s^T W  (an Rs × N matrix), or w_n^s = F_s^T w_n
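The projection into the MDS space is then a single matrix product; again the data here are random placeholders for the real codes.

```python
import numpy as np

rng = np.random.default_rng(2)
R, N, Rs = 8, 100, 3
W = rng.standard_normal((R, N))

lams, F = np.linalg.eigh(W @ W.T)
order = np.argsort(lams)[::-1]
Fs = F[:, order[:Rs]]             # the Rs most significant eigenvectors

Ws = Fs.T @ W                     # reduced codes, Rs x N
w0_s = Fs.T @ W[:, 0]             # code of a single word in MDS space
```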
Representative vector of a document
A representative vector for a document D should contain the semantic meaning of the whole document.
Two measures are defined:
Peak preferred measure
Average preferred measure
The magnitude is normalized.
v_D^a = [a_1 a_2 … a_{Rs}]^T, where a_r = max_{w_n^s ∈ D} |w_{nr}^s|,  r = 1, …, Rs
v_D^b = [b_1 b_2 … b_{Rs}]^T, where b_r = (1/|D|) Σ_{w_n^s ∈ D} w_{nr}^s,  r = 1, …, Rs
v_D = v_D^b / ||v_D^b||
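One plausible reading of the two measures can be sketched as below; `doc_vectors` and `normalize` are assumed names, and the per-feature peak magnitude / per-feature mean definitions are reconstructions from the slide.

```python
import numpy as np

def doc_vectors(Ws_doc):
    """Peak- and average-preferred vectors for one document.

    Ws_doc : (Rs, M) reduced codes of the M words in the document.
    Returns (v_a, v_b): per-feature max magnitude and per-feature mean.
    """
    v_a = np.max(np.abs(Ws_doc), axis=1)  # peak preferred measure
    v_b = np.mean(Ws_doc, axis=1)         # average preferred measure
    return v_a, v_b

def normalize(v):
    # Magnitude normalization, so documents of different lengths compare
    return v / np.linalg.norm(v)

# Two-word document in a 2-feature MDS space
Ws_doc = np.array([[1.0, -3.0],
                   [0.0,  4.0]])
v_a, v_b = doc_vectors(Ws_doc)
```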
Representative vector of a document
The normalized measure v_D is used to represent a whole document. A representative vector v_Q for a whole query can be obtained in the same way.
The relation score is defined as
RS(D, Q) = v_D · v_Q / (||v_D|| ||v_Q||)
Iterative re-encoding
Elman's method for sentence generation with the fixed syntax Noun + Verb + Noun cannot be applied to more complex sentences.
We modify his method: each word initially has a random lexical code.
After the jth training epoch, a new raw code is calculated
w_n^j = [w_n1 w_n2 … w_nR]^T
s_n = { U_oh H(w(t-1)) | w(t) = w_n, w(t) ∈ D }
w_n^raw = (1/|s_n|) Σ_{s ∈ s_n} s,  n = 1, …, N, where |s_n| is the total number of words in the set s_n
Iterative re-encoding
The set s_n contains all predictions for the word w_n based on its preceding words.
After each epoch, all the codes are normalized by the following two equations. The normalization prevents a diminished solution derived by the backpropagation algorithm.
s_n = { U_oh H(w(t-1)) | w(t) = w_n, w(t) ∈ D }
w_n^ave = w_n^raw − (1/N) Σ_{m=1}^N w_m^raw,  n = 1, …, N
w_n^{j+1} = 0.5 w_n^ave / (w_n^{ave,T} w_n^ave)^{1/2}
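A sketch of one normalization pass, under the assumption (reconstructed here, not stated verbatim in the slides) that the mean code is subtracted and each code is then rescaled to magnitude 0.5:

```python
import numpy as np

def renormalize(W_raw):
    """Per-epoch normalization (assumed form): remove the mean code,
    then rescale every code to magnitude 0.5 so backpropagation cannot
    drive all codes toward a diminished (near-zero) solution.

    W_raw : (R, N) raw codes, one column per word.
    """
    W_ave = W_raw - W_raw.mean(axis=1, keepdims=True)   # subtract mean code
    norms = np.linalg.norm(W_ave, axis=0, keepdims=True)
    return 0.5 * W_ave / np.where(norms == 0, 1.0, norms)

W_raw = np.array([[1.0, 3.0],
                  [2.0, 2.0]])
W_new = renormalize(W_raw)
```

After this step every column of W_new has magnitude exactly 0.5 and every feature has zero mean across the vocabulary.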
Example
Test the ability of classifying 36 of Shakespeare's plays.
We consider each play as the query input and calculate the relation score between it and each other play. The figure below shows the relation tree.
Figure legend: c = comedy, r = romance, h = history, t = tragedy; the number denotes the publication year.
Model parameters: D_i, i = 1…36; Q_i, i = 1…36; N = 10000; Lh = Lc = 200; Lo = Li = Rs = R = 64
Example
We provide a semantic search tool using a corpus from Shakespeare's comedies and tragedies at http://red.csie.ntu.edu.tw/demo/literal/SAS.htm
Example search results with parameters D_i, i = 1…7777; N = 10000; Lo = Li = R = 100; Lh = Lc = 200; Rs = 64
Query Search result
she loves kiss BENVOLIO: Tut, you saw her fair, none else being by, herself poised with herself in either eye; but in that crystal scales let there be weigh'd your lady's love against some other maid that I will show you shining at this feast, and she shall scant show well that now shows best.
-Romeo and Juliet
armies die in blood MARCUS ANDRONICUS: Which of your hands hath not defended Rome, and rear'd aloft the bloody battle-axe, writing destruction on the enemy's castle? O, none of both but are of high desert: my hand hath been but idle; let it serve to ransom my two nephews from their death; then have I kept it to a worthy end.
-Titus Andronicus
Summary
We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods.
We have presented the result of applying this method in studying literary works.
The trained semantic codes can facilitate other research such as linguistic analysis, authorship identification, and categorization.
The method can be modified to accommodate polysemous words.