Semantic Addressable Encoding
Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang
Department of Computer Science and Information Engineering, National Taiwan University
TC402, Oct. 5, ICONIP 2006, Hong Kong
Web: red.csie.ntu.edu.tw
Sentence generating function
The semantic world of Mark Twain
Semantic search under Shakespeare
Outline
Introduction
Encoding method
Elman network
The word corpus – Elman's idea
Review of semantic search
Multidimensional scaling (MDS) space
Representative vector of a document
Iterative re-encoding
Example
Summary
Introduction
A central problem in semantic analysis is how to effectively encode and extract the contents of word sequences.
The traditional way of creating a prime semantic space is extremely expensive and complex, because experienced linguists are required to analyze a huge number of words.
This paper presents an automatic encoding process.
Elman Network
U_oh: Lh × Lo weight matrix; U_hi: Li × Lh weight matrix; U_hc: Lc × Lh weight matrix
Lo = # neurons in the output layer; Lh = # neurons in the hidden layer; Li = # neurons in the input layer; Lc = # neurons in the context layer
The context layer carries memory
The hidden layer activates output layer and refreshes context layer
Desired behavior after training process
H(w(t)) = f(U_hi w(t) + U_hc x(t-1)), where x(t-1) is the previous hidden output stored in the context layer
f(x) = 1.7159 tanh(2x/3)
w(t+1) ≈ E(w(t)) = f(U_oh H(w(t)))
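The forward pass above can be sketched in a few lines of NumPy. This is a minimal illustration: the dimensions and random weights are assumptions for the demo, not the paper's trained network.

```python
import numpy as np

def f(x):
    # Activation from the slides: f(x) = 1.7159 * tanh(2x/3)
    return 1.7159 * np.tanh(2.0 * x / 3.0)

def elman_step(w_t, context, U_hi, U_hc, U_oh):
    """One forward step: the hidden layer reads the input word and the
    context (the previous hidden output), then activates the output
    layer, which after training predicts the next word."""
    h = f(U_hi @ w_t + U_hc @ context)  # H(w(t)): refreshes the context
    e = f(U_oh @ h)                     # E(w(t)) ~ w(t+1) after training
    return e, h

# Illustrative dimensions and untrained random weights.
rng = np.random.default_rng(0)
Li, Lh, Lo = 4, 6, 4
U_hi = 0.1 * rng.standard_normal((Lh, Li))
U_hc = 0.1 * rng.standard_normal((Lh, Lh))
U_oh = 0.1 * rng.standard_normal((Lo, Lh))

context = np.zeros(Lh)
for w in rng.standard_normal((3, Li)):  # feed a 3-word sequence
    prediction, context = elman_step(w, context, U_hi, U_hc, U_oh)
```

Because the activation is a scaled tanh, every hidden and output value stays strictly inside (-1.7159, 1.7159).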
The word corpus – Elman’s idea
All words are coded with certain given lexical codes and all word sequences in corpus D follow the syntax (Noun + Verb + Noun).
After training, input all sequences again and record all hidden outputs for each individual input.
Obtain the new code for the nth word by averaging all vectors in the set S_En.
Construct a word tree based on the new codes to explore the relationship between words.
S_En = { H(w(t)) | w(t) = w_n, w(t) ∈ D }
w_n = (1/|S_En|) Σ_{H(w(t)) ∈ S_En} H(w(t)),  n = 1, …, N
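The averaging step can be sketched as follows; `average_codes` and its arguments are assumed names for this toy illustration.

```python
import numpy as np

def average_codes(hidden_outputs, word_ids, N):
    """New code for word n = mean of all hidden vectors recorded while
    word n was the input (the set S_En in the slides).

    hidden_outputs : (T, Lh) array, one hidden vector per corpus position
    word_ids       : length-T sequence, word index at each position
    N              : vocabulary size
    """
    hidden_outputs = np.asarray(hidden_outputs)
    codes = np.zeros((N, hidden_outputs.shape[1]))
    for n in range(N):
        members = hidden_outputs[[t for t, w in enumerate(word_ids) if w == n]]
        if len(members):
            codes[n] = members.mean(axis=0)  # average over S_En
    return codes

# Toy corpus: word 0 appears at positions 0 and 2.
H = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
codes = average_codes(H, [0, 1, 0], N=2)
# codes[0] is the mean of H[0] and H[2] -> [2.0, 2.0]
```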
Review semantic search
The conventional semantic search constructs a semantic model and a semantic measure.
A semantic code set manually designed by experts is used in the model (the main focus here).
One can build a raw semantic matrix W for all N different words.
The code of a word is a column vector of R features.
One may use the orthogonal space configured by the characteristic (eigen) decomposition of the matrix WW^T.
W = [w_1 w_2 … w_N]  (an R × N matrix)
w_n = [w_n1 w_n2 … w_nR]^T
The semantic search
Since WW^T is a symmetric matrix, all its eigenvalues are real and nonnegative.
Each eigenvalue λ_i equals the variance of the N projections of the codes on the ith eigenvector f_i; that is,
WW^T = F Λ F^T
Λ = diag(λ_1, λ_2, …, λ_R), with λ_1 ≥ λ_2 ≥ … ≥ λ_R ≥ 0
F = [f_1 f_2 … f_R], with f_r^T f_r = 1
λ_i = Σ_{n=1}^N (w_n^T f_i)²,  i = 1, …, R
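The decomposition can be checked numerically in a few lines; the random matrix below stands in for the real semantic codes.

```python
import numpy as np

rng = np.random.default_rng(1)
R, N = 5, 50
W = rng.standard_normal((R, N))   # raw semantic matrix, one column per word

C = W @ W.T                       # R x R, symmetric
lams, F = np.linalg.eigh(C)       # eigh: real eigenvalues, orthonormal F
lams, F = lams[::-1], F[:, ::-1]  # reorder to descending eigenvalues

# Each eigenvalue equals the summed squared projections of the codes
# on its eigenvector: lambda_i = sum_n (w_n^T f_i)^2.
proj_var = np.sum((F.T @ W) ** 2, axis=1)
```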
Multidimensional Scaling (MDS) space
Select a set of Rs eigenvectors {f_r, r = 1…Rs} from all R eigenvectors to build a reduced feature space.
The MDS space is MDS = span{F_s}. These selected features are independent and significant. The new code of each word in this space is
F_s = [f_1 f_2 … f_{Rs}]  (an R × Rs matrix)
W^s = F_s^T W  (an Rs × N matrix), or w_n^s = F_s^T w_n
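The projection into the MDS space is then a single matrix product; again the data here are random placeholders for the real codes.

```python
import numpy as np

rng = np.random.default_rng(2)
R, N, Rs = 8, 100, 3
W = rng.standard_normal((R, N))

lams, F = np.linalg.eigh(W @ W.T)
order = np.argsort(lams)[::-1]
Fs = F[:, order[:Rs]]             # the Rs most significant eigenvectors

Ws = Fs.T @ W                     # reduced codes, Rs x N
w0_s = Fs.T @ W[:, 0]             # code of a single word in MDS space
```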
Representative vector of a document
A representative vector for a document D should contain the semantic meaning of the whole document.
Two measures are defined:
Peak preferred measure
Average preferred measure
The magnitude is normalized.
v_D^a = [a_1 a_2 … a_{Rs}]^T, where a_r = max_{w_n^s ∈ D} |w_{nr}^s|,  r = 1, …, Rs
v_D^b = [b_1 b_2 … b_{Rs}]^T, where b_r = (1/|D|) Σ_{w_n^s ∈ D} w_{nr}^s,  r = 1, …, Rs
v_D = v_D^b / ||v_D^b||
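One plausible reading of the two measures can be sketched as below; `doc_vectors` and `normalize` are assumed names, and the per-feature peak magnitude / per-feature mean definitions are reconstructions from the slide.

```python
import numpy as np

def doc_vectors(Ws_doc):
    """Peak- and average-preferred vectors for one document.

    Ws_doc : (Rs, M) reduced codes of the M words in the document.
    Returns (v_a, v_b): per-feature max magnitude and per-feature mean.
    """
    v_a = np.max(np.abs(Ws_doc), axis=1)  # peak preferred measure
    v_b = np.mean(Ws_doc, axis=1)         # average preferred measure
    return v_a, v_b

def normalize(v):
    # Magnitude normalization, so documents of different lengths compare
    return v / np.linalg.norm(v)

# Two-word document in a 2-feature MDS space
Ws_doc = np.array([[1.0, -3.0],
                   [0.0,  4.0]])
v_a, v_b = doc_vectors(Ws_doc)
```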
Representative vector of a document
The normalized measure v_D is used to represent a whole document. A representative vector v_Q for a whole query can be obtained in the same way.
The relation score is defined as
RS(D, Q) = v_D · v_Q / (||v_D|| ||v_Q||)
Iterative re-encoding
Elman's method for sentence generation with the fixed syntax Noun + Verb + Noun cannot be applied to more complex sentences.
We modify his method: each word initially has a random lexical code.
After the jth training epoch, a new raw code is calculated
w_n^j = [w_n1 w_n2 … w_nR]^T
s_n = { U_oh H(w(t-1)) | w(t) = w_n, w(t) ∈ D }
w_n^raw = (1/|s_n|) Σ_{s ∈ s_n} s,  n = 1, …, N, where |s_n| is the total number of words in the set s_n
Iterative re-encoding
The set s_n contains all predictions for the word w_n based on its preceding words.
After each epoch, all the codes are normalized by the following two equations. The normalization prevents a diminished solution derived by the backpropagation algorithm.
s_n = { U_oh H(w(t-1)) | w(t) = w_n, w(t) ∈ D }
w_n^ave = w_n^raw − (1/N) Σ_{m=1}^N w_m^raw,  n = 1, …, N
w_n^{j+1} = 0.5 w_n^ave / (w_n^{ave,T} w_n^ave)^{1/2}
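A sketch of one normalization pass, under the assumption (reconstructed here, not stated verbatim in the slides) that the mean code is subtracted and each code is then rescaled to magnitude 0.5:

```python
import numpy as np

def renormalize(W_raw):
    """Per-epoch normalization (assumed form): remove the mean code,
    then rescale every code to magnitude 0.5 so backpropagation cannot
    drive all codes toward a diminished (near-zero) solution.

    W_raw : (R, N) raw codes, one column per word.
    """
    W_ave = W_raw - W_raw.mean(axis=1, keepdims=True)   # subtract mean code
    norms = np.linalg.norm(W_ave, axis=0, keepdims=True)
    return 0.5 * W_ave / np.where(norms == 0, 1.0, norms)

W_raw = np.array([[1.0, 3.0],
                  [2.0, 2.0]])
W_new = renormalize(W_raw)
```

After this step every column of W_new has magnitude exactly 0.5 and every feature has zero mean across the vocabulary.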
Example
Test the ability of classifying 36 of Shakespeare's plays.
We consider each play as the query input and calculate the relation score between it and each other play. The figure below shows the relation tree.
Figure legend: c = comedy, r = romance, h = history, t = tragedy; the number denotes the publication year.
Model parameters: D_i, i = 1…36; Q_i, i = 1…36; N = 10000; Lh = Lc = 200; Lo = Li = Rs = R = 64
Example
We provide a semantic search tool using a corpus from Shakespeare's comedies and tragedies at http://red.csie.ntu.edu.tw/demo/literal/SAS.htm
Example search results with parameters D_i, i = 1…7777; N = 10000; Lo = Li = R = 100; Lh = Lc = 200; Rs = 64
Query Search result
she loves kiss BENVOLIO: Tut, you saw her fair, none else being by, herself poised with herself in either eye; but in that crystal scales let there be weigh'd your lady's love against some other maid that I will show you shining at this feast, and she shall scant show well that now shows best.
-Romeo and Juliet
armies die in blood MARCUS ANDRONICUS: Which of your hands hath not defended Rome, and rear'd aloft the bloody battle-axe, writing destruction on the enemy's castle? O, none of both but are of high desert: my hand hath been but idle; let it serve to ransom my two nephews from their death; then have I kept it to a worthy end.
-Titus Andronicus
Summary
We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods.
We have presented the result of applying this method in studying literary works.
The trained semantic codes can facilitate other research such as linguistic analysis, authorship identification, and categorization.
The method can be modified to accommodate polysemous words.