Introducing Latent Semantic Analysis
- Thomas K. Landauer et al., "An introduction to latent semantic analysis," Discourse Processes, Vol. 25 (2-3), pp. 259-284, 1998.
- Scott Deerwester et al., "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Vol. 41 (6), pp. 391-407, 1990.
- Kirk Baker, "Singular Value Decomposition Tutorial," electronic document, 2005.
Hee-Gook Jun, Aug 22, 2014
Outline
- SVD
- SVD to LSA
- Conclusion
Eigendecomposition vs. Singular Value Decomposition
Eigendecomposition: A = P Λ P⁻¹
- The matrix must be diagonalizable
- The matrix must be square
- An n × n matrix must have n linearly independent eigenvectors (e.g., any symmetric matrix qualifies)

Singular Value Decomposition: A = U Σ Vᵀ
- Computable for a matrix of any size (m × n)
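The contrast above can be checked numerically. A minimal sketch assuming NumPy; the two matrices are illustrative, not from the slides:

```python
import numpy as np

# Eigendecomposition: A = P Lambda P^-1 needs a square, diagonalizable matrix.
A_sq = np.array([[2.0, 1.0],
                 [1.0, 2.0]])          # 2 x 2, symmetric => diagonalizable
lam, P = np.linalg.eig(A_sq)
print(np.allclose(P @ np.diag(lam) @ np.linalg.inv(P), A_sq))  # True

# SVD: A = U Sigma V^T works for any m x n matrix, square or not.
A_rect = np.array([[1.0, 0.0, 2.0],
                   [0.0, 1.0, 1.0]])   # 2 x 3, has no eigendecomposition
U, s, Vt = np.linalg.svd(A_rect, full_matrices=False)
print(np.allclose(U @ np.diag(s) @ Vt, A_rect))  # True
```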
U: Left Singular Vectors of A
Unitary matrix
- Columns of U are orthonormal (orthogonal + unit length)
- They are orthonormal eigenvectors of AAᵀ

A = U Σ Vᵀ

Example: u₁ = [0, 0, 0, 1] and u₂ = [0, 1, 0, 0] are orthogonal:
u₁ · u₂ = (0×0) + (0×1) + (0×0) + (1×0) = 0
and u₁ is a normal (unit) vector:
‖u₁‖ = √(0² + 0² + 0² + 1²) = 1
V: Right Singular Vectors of A
Unitary matrix
- Columns of V are orthonormal (orthogonal + unit length)
- They are orthonormal eigenvectors of AᵀA

A = U Σ Vᵀ
∑ (or S)
Diagonal matrix
- Diagonal entries are the singular values of A
- The non-zero singular values are the square roots of the non-zero eigenvalues of AAᵀ (equivalently, of AᵀA), listed in descending order

A = U Σ Vᵀ
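This eigenvalue relationship can be verified directly. A sketch assuming NumPy, with an arbitrary illustrative matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])            # any m x n matrix

s = np.linalg.svd(A, compute_uv=False)      # singular values, descending
eig = np.linalg.eigvalsh(A @ A.T)[::-1]     # eigenvalues of A A^T, descending
print(np.allclose(s, np.sqrt(eig)))         # True: sigma_i = sqrt(lambda_i)
```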
Calculation Procedure
1. U is a list of eigenvectors of AAᵀ
   1. Compute AAᵀ
   2. Compute the eigenvectors of AAᵀ
   3. Orthonormalize them
2. V is a list of eigenvectors of AᵀA
   1. Compute AᵀA
   2. Compute the eigenvectors of AᵀA
   3. Orthonormalize them and transpose
3. Σ holds the square roots of the eigenvalues on its diagonal (the non-zero eigenvalues of AAᵀ and AᵀA coincide)

A = U Σ Vᵀ
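The three steps can be carried out directly. A sketch assuming NumPy and an illustrative full-rank matrix; recovering V as vᵢ = Aᵀuᵢ / σᵢ is one convenient way to keep the signs of U and V consistent:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])                    # illustrative 3 x 2 matrix

# 1. U: orthonormal eigenvectors of A A^T (eigh handles the symmetric
#    matrix and already returns an orthonormal basis).
eigvals, eigvecs = np.linalg.eigh(A @ A.T)      # ascending order
order = np.argsort(eigvals)[::-1][:A.shape[1]]  # keep the top-2, descending
U = eigvecs[:, order]

# 3. Sigma: square roots of the shared non-zero eigenvalues.
sigma = np.sqrt(eigvals[order])                 # sqrt(6) and 2 here

# 2. V: recovered column by column as v_i = A^T u_i / sigma_i,
#    then transposed to give V^T.
Vt = (A.T @ U / sigma).T

print(np.allclose(U @ np.diag(sigma) @ Vt, A))  # True
```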
1.1 Matrix U – Compute AAᵀ
Start with the matrix A, take its transpose Aᵀ, and then compute the product AAᵀ.
1.2 Matrix U – Eigenvectors and Eigenvalues [1/2]
Eigenvector
- A nonzero vector x that satisfies Ax = λx, where A is a square matrix, λ is an eigenvalue (a scalar), and x is the corresponding eigenvector
- Rearranged: (A − λI)x = 0; for nonzero solutions, set the determinant of the coefficient matrix to zero: det(A − λI) = 0
1.2 Matrix U – Eigenvectors and Eigenvalues [2/2]
Calculated eigenvalues λ₁ and λ₂:
① For λ₁: eigenvector [1, 1]
② For λ₂: eigenvector [1, −1]
Thus, the set of eigenvectors is [1 1; 1 −1]
1.3 Matrix U – Orthonormalization
Gram-Schmidt orthonormalization:
w_k = v_k − Σ_{i=1}^{k−1} (u_i · v_k) u_i,  u_k = w_k / ‖w_k‖

Turning the set of eigenvectors {v₁, v₂} into an orthonormal matrix [u₁ u₂]:
1. Normalize v₁ to get u₁
2. Find w₂ (orthogonal to u₁)
3. Normalize w₂ to get u₂
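The procedure above can be sketched in a few lines, assuming NumPy; the two input vectors are the eigenvectors from the 2 × 2 example:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize vectors: w_k = v_k - sum_i (u_i . v_k) u_i,
    then u_k = w_k / ||w_k||."""
    basis = []
    for v in vectors:
        w = v - sum((u @ v) * u for u in basis)  # subtract projections
        basis.append(w / np.linalg.norm(w))      # normalize
    return np.column_stack(basis)

# Eigenvectors [1, 1] and [1, -1] from the example above.
Q = gram_schmidt([np.array([1.0, 1.0]), np.array([1.0, -1.0])])
print(np.round(Q, 4))
print(np.allclose(Q.T @ Q, np.eye(2)))  # columns are orthonormal -> True
```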
2.1 Matrix Vᵀ – Compute AᵀA
Start with the matrix A, take its transpose Aᵀ, and then compute the product AᵀA.
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [1/2]
Eigenvector
- A nonzero vector x that satisfies Ax = λx, where A is a square matrix, λ is an eigenvalue (a scalar), and x is the corresponding eigenvector
- Rearranged: (AᵀA − λI)x = 0; set the determinant of the coefficient matrix to zero, det(AᵀA − λI) = 0, expanding it by cofactor expansion
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [2/2]
Calculated eigenvalues λ₁, λ₂, and λ₃:
① For λ₁: its eigenvector
② For λ₂: its eigenvector
③ For λ₃: its eigenvector
Thus, the three eigenvectors together form the set of eigenvectors.
2.3 Matrix Vᵀ – Orthonormalization and Transpose
Gram-Schmidt orthonormalization:
w_k = v_k − Σ_{i=1}^{k−1} (u_i · v_k) u_i,  u_k = w_k / ‖w_k‖

Turning the set of eigenvectors {v₁, v₂, v₃} into an orthonormal matrix [u₁ u₂ u₃]:
1. Normalize v₁ to get u₁
2. Find w₂ (orthogonal to u₁) and normalize it to get u₂
3. Find w₃ (orthogonal to u₁ and u₂) and normalize it to get u₃
Finally, transpose the result to obtain Vᵀ.
3.1 Matrix ∑ (= S)
Square roots of the non-zero eigenvalues
- Populate the diagonal of Σ with these values in descending order
- The diagonal entries of Σ are the singular values of A
Outline
- SVD
- SVD to LSA
Latent Semantic Analysis
Use SVD (Singular Value Decomposition)
- to simulate human learning of word and passage meaning

Represent word and passage meaning
- as high-dimensional vectors in the semantic space
LSA Example
doc 1: "modem the steering linux. modem, linux the modem. steering the modem. linux"
doc 2: "linux; the linux. the linux modem linux. the modem, clutch the modem. petrol"
doc 3: "petrol! clutch the steering, steering, linux. the steering clutch petrol. clutch the petrol; the clutch"
doc 4: "the the the. clutch clutch clutch! steering petrol; steering petrol petrol; steering petrol"
First analysis – Document Similarity
Second analysis – Term Similarity
LSA Example: Build a Term Frequency Matrix
Let matrix A =

          d1  d2  d3  d4
linux      3   4   1   0
modem      4   3   0   1
the        3   4   4   3
clutch     0   1   4   3
steering   2   0   3   3
petrol     0   1   3   4
LSA Example: Compute SVD of Matrix A
A = U S Vᵀ, where A is the 6 × 4 term frequency matrix above, U is 6 × 4, S is 4 × 4, and Vᵀ is 4 × 4.

U:
                 dim1   dim2   dim3   dim4
t1 (linux)      -0.33  -0.53   0.36  -0.14
t2 (modem)      -0.32  -0.53  -0.48   0.35
t3 (the)        -0.61  -0.09   0.26  -0.14
t4 (clutch)     -0.37   0.42   0.60  -0.23
t5 (steering)   -0.35   0.25  -0.68  -0.46
t6 (petrol)     -0.37   0.42   0.01   0.74

S:
diag(11.40, 6.27, 2.21, 1.28)

Vᵀ:
        d1     d2     d3     d4
dim1  -0.42  -0.56  -0.64  -0.29
dim2  -0.48  -0.52   0.61   0.33
dim3  -0.56   0.44   0.27  -0.63
dim4  -0.51   0.46  -0.35   0.63

- R code -
result <- svd(A)
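The R call above has a direct NumPy equivalent. A sketch that rebuilds A and checks the factorization; the printed values should match the tables up to rounding and column signs:

```python
import numpy as np

# Term frequency matrix A (rows t1..t6, columns d1..d4).
A = np.array([[3, 4, 1, 0],
              [4, 3, 0, 1],
              [3, 4, 4, 3],
              [0, 1, 4, 3],
              [2, 0, 3, 3],
              [0, 1, 3, 4]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # U: 6x4, s: 4, Vt: 4x4
print(np.round(s, 2))                # singular values, descending
print(np.allclose((U * s) @ Vt, A))  # factors multiply back to A -> True
```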
LSA Example: Reduced SVD
Keep only the k largest singular values (here k = 2), together with the matching columns of U and rows of Vᵀ: the 6 × 4 · 4 × 4 · 4 × 4 factorization becomes 6 × 2 · 2 × 2 · 2 × 4.

U₂ (first two columns of U):
                 dim1   dim2
t1 (linux)      -0.33  -0.53
t2 (modem)      -0.32  -0.53
t3 (the)        -0.61  -0.09
t4 (clutch)     -0.37   0.42
t5 (steering)   -0.35   0.25
t6 (petrol)     -0.37   0.42

S₂ = diag(11.40, 6.27)

V₂ᵀ (first two rows of Vᵀ):
        d1     d2     d3     d4
dim1  -0.42  -0.56  -0.64  -0.29
dim2  -0.48  -0.52   0.61   0.33
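Truncation is just slicing. A sketch assuming NumPy, with A as above:

```python
import numpy as np

A = np.array([[3, 4, 1, 0],
              [4, 3, 0, 1],
              [3, 4, 4, 3],
              [0, 1, 4, 3],
              [2, 0, 3, 3],
              [0, 1, 3, 4]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                   # keep the two largest singular values
U2, S2, Vt2 = U[:, :k], np.diag(s[:k]), Vt[:k, :]
print(U2.shape, S2.shape, Vt2.shape)    # (6, 2) (2, 2) (2, 4)

A2 = U2 @ S2 @ Vt2                      # best rank-2 approximation of A
err = np.linalg.norm(A - A2)            # equals sqrt(s[2]^2 + s[3]^2)
print(np.round(err, 2))
```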
LSA Example: Document Similarity
Document coordinates = S₂ × V₂ᵀ (2 × 2 times 2 × 4), one two-dimensional column per document:

        d1     d2     d3     d4
dim1  -4.83  -5.49  -6.49  -5.86
dim2  -3.52  -3.28   2.79   2.88

Cosine similarity:
Sim(A, B) = cos θ = (A · B) / (|A| |B|)
e.g. Sim(d1, d2) = ((−4.83 × −5.49) + (−3.52 × −3.28)) / (√((−4.83)² + (−3.52)²) × √((−5.49)² + (−3.28)²)) ≈ 0.99

Document similarity matrix:
      d1    d2    d3    d4
d1   1     0.99  0.51  0.46
d2   0.99  1     0.58  0.54
d3   0.51  0.58  1     0.99
d4   0.46  0.54  0.99  1
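The document coordinates and the similarity matrix can be reproduced in NumPy; values match the tables up to rounding, and the SVD's sign ambiguity does not affect the cosines:

```python
import numpy as np

A = np.array([[3, 4, 1, 0],
              [4, 3, 0, 1],
              [3, 4, 4, 3],
              [0, 1, 4, 3],
              [2, 0, 3, 3],
              [0, 1, 3, 4]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
docs = (np.diag(s[:k]) @ Vt[:k, :]).T   # one 2-d coordinate row per document

# Cosine similarity: sim(a, b) = a . b / (|a| |b|)
norms = np.linalg.norm(docs, axis=1)
sim = docs @ docs.T / np.outer(norms, norms)
print(np.round(sim, 2))   # d1~d2 and d3~d4 come out near 0.99
```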
LSA Example: Term Similarity
Term coordinates = U₂ × S₂ (6 × 2 times 2 × 2), one two-dimensional row per term:

               dim1   dim2
t1 (linux)    -3.76  -3.33
t2 (modem)    -3.65  -3.35
t3 (the)      -7.01  -0.61
t4 (clutch)   -4.30   2.63
t5 (steering) -4.09   1.59
t6 (petrol)   -4.24   2.65

Cosine similarity: Sim(A, B) = cos θ = (A · B) / (|A| |B|)

Term similarity matrix:
       t1    t2    t3    t4    t5    t6
t1    1     0.99  0.80  0.29  0.45  0.28
t2    0.99  1     0.79  0.27  0.44  0.26
t3    0.80  0.79  1     0.80  0.89  0.79
t4    0.29  0.27  0.80  1     0.98  0.99
t5    0.45  0.44  0.89  0.98  1     0.98
t6    0.28  0.26  0.79  0.99  0.98  1

Similar terms for each term:
- linux → modem, the
- modem → linux, the
- the → linux, modem, clutch, steering, petrol
- clutch → the, steering, petrol
- steering → the, clutch, petrol
- petrol → the, clutch, steering
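Term similarities follow the same pattern, with U₂S₂ as the coordinates. A NumPy sketch that also picks each term's most similar neighbor:

```python
import numpy as np

A = np.array([[3, 4, 1, 0],
              [4, 3, 0, 1],
              [3, 4, 4, 3],
              [0, 1, 4, 3],
              [2, 0, 3, 3],
              [0, 1, 3, 4]], dtype=float)
terms = ["linux", "modem", "the", "clutch", "steering", "petrol"]

U, s, Vt = np.linalg.svd(A, full_matrices=False)
coords = U[:, :2] * s[:2]               # term coordinates U2 S2, one row per term

norms = np.linalg.norm(coords, axis=1)
sim = coords @ coords.T / np.outer(norms, norms)

for i, t in enumerate(terms):           # most similar other term for each term
    j = max((c for c in range(len(terms)) if c != i), key=lambda c: sim[i, c])
    print(f"{t} -> {terms[j]}")
```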
Conclusion
Pros
- Computes document similarity even when the documents share no common words

Cons
- Lacks a statistical foundation → addressed by PLSA (probabilistic LSA)
(Appendix) Given A = U S Vᵀ, which dimensions should be chosen for the reduction to drop? Those with the smallest singular values (here the dimensions with σ = 2.21 and σ = 1.28).