relational duality: unsupervised extraction of semantic relations between entities on the web

Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web. Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka. International World Wide Web Conference 2010, Raleigh, North Carolina, USA.


Page 1: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka

International World Wide Web Conference 2010, Raleigh, North Carolina, USA

Page 2: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Relation Extraction on the Web

Problem definition
- Given a crawled corpus of Web text, identify all the different semantic relations that exist between entities mentioned in the corpus.

Challenges
- The number or the types of the relations that exist in the corpus are not known in advance.
- Costly, if not impossible, to create training data.
- Entity name variants must be handled: Will Smith vs. William Smith vs. fresh prince, …
- Paraphrases of surface forms must be handled: acquired by, purchased by, bought by, …
- Multiple relations can exist between a single pair of entities.

Page 3: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Relational Duality

Relation: ACQUISITION
- Extensional definition: (Microsoft, Powerset), (Google, YouTube)
- Intensional definition: X acquires Y, X buys Y for $ …
- DUALITY: the extensional and intensional definitions describe the same relation.

Page 4: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Overview of the proposed method

1. Web crawler
2. Text corpus
3. Sentence splitter
4. POS tagger
5. NP chunker
6. Pattern extractor (lexical patterns, syntactic patterns)
7. Entity pairs vs. patterns matrix, with rows such as (Google, YouTube) and (Microsoft, Powerset) and columns such as "X acquires Y" and "X buys Y"
8. Sequential co-clustering algorithm
9. Entity pair clusters and lexico-syntactic pattern clusters
10. Cluster labeler (L1 regularized multi-class logistic regression)

Page 5: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Lexico-Syntactic Pattern Extraction
- Replace the two entities in a sentence by X and Y.
- Generate subsequences (over tokens and POS tags):
  - A subsequence must contain both X and Y.
  - The maximum length of a subsequence must be L tokens.
  - A skip should not exceed g tokens.
  - The total number of tokens skipped must not exceed G.
  - Negation contractions are expanded and are not skipped.

Example
… merger/NN is/VBZ software/NN maker/NN [Adobe/NNP System/NN] acquisition/NN of/IN [Macromedia/NNP]
- Lexical patterns: X acquisition of Y; software maker X acquisition of Y
- Syntactic patterns: X NN IN Y; NN NN X NN IN Y
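The subsequence constraints above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_patterns`, the default values chosen for L, g and G, and the use of "not" as the only negation token are all assumptions made for the example, and contraction expansion is left out.

```python
from typing import List, Set

def extract_patterns(tokens: List[str], L: int = 6, g: int = 2, G: int = 4) -> Set[str]:
    """Enumerate subsequences of `tokens` that contain both X and Y.

    L: maximum subsequence length, g: maximum size of a single skip,
    G: maximum total number of skipped tokens (default values are hypothetical).
    """
    patterns: Set[str] = set()

    def extend(seq: List[int], skipped: int) -> None:
        words = [tokens[i] for i in seq]
        if "X" in words and "Y" in words:
            patterns.add(" ".join(words))
        if len(seq) == L:
            return
        last = seq[-1]
        for gap in range(g + 1):
            nxt = last + 1 + gap
            if nxt >= len(tokens) or skipped + gap > G:
                break
            # Assumption: "not" stands in for all negation words; never skip over it.
            if "not" in tokens[last + 1:nxt]:
                break
            extend(seq + [nxt], skipped + gap)

    for start in range(len(tokens)):
        extend([start], 0)
    return patterns

# Entities already replaced by X and Y; the same routine runs over POS tags.
lexical = extract_patterns("software maker X acquisition of Y".split())
syntactic = extract_patterns("NN NN X NN IN Y".split())
```

With these settings, `lexical` contains both "X acquisition of Y" and "software maker X acquisition of Y", and `syntactic` contains "X NN IN Y", alongside gapped variants such as "X of Y".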

Page 6: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Entity pairs vs. lexico-syntactic pattern matrix
- Select the most frequent entity pairs and patterns, and create an entity-pair vs. pattern matrix.

[Figure: entity pairs vs. patterns matrix, with rows such as (Google, YouTube) and (Microsoft, Powerset) and columns such as "X acquires Y" and "X buys Y"]
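A minimal sketch of building such a matrix from (entity pair, pattern) occurrences is given below. The function name `build_matrix`, the cut-off parameters `max_pairs` and `max_patterns`, and the occurrence counts are invented for illustration; the slides do not say how many pairs or patterns are kept.

```python
from collections import Counter
from typing import List, Tuple

Pair = Tuple[str, str]

def build_matrix(occurrences: List[Tuple[Pair, str]], max_pairs: int, max_patterns: int):
    """Count how often each frequent pattern co-occurs with each frequent entity pair."""
    pair_freq = Counter(pair for pair, _ in occurrences)
    pattern_freq = Counter(pattern for _, pattern in occurrences)
    # Keep only the most frequent entity pairs (rows) and patterns (columns).
    pairs = [p for p, _ in pair_freq.most_common(max_pairs)]
    patterns = [q for q, _ in pattern_freq.most_common(max_patterns)]
    row = {p: i for i, p in enumerate(pairs)}
    col = {q: j for j, q in enumerate(patterns)}
    matrix = [[0] * len(patterns) for _ in pairs]
    for pair, pattern in occurrences:
        if pair in row and pattern in col:
            matrix[row[pair]][col[pattern]] += 1
    return pairs, patterns, matrix

# Made-up occurrence counts, for illustration only.
occurrences = (
    [(("Google", "YouTube"), "X acquires Y")] * 3
    + [(("Google", "YouTube"), "X buys Y")] * 2
    + [(("Microsoft", "Powerset"), "X acquires Y")] * 2
    + [(("Microsoft", "Powerset"), "X rarely seen Y")]
)
pairs, patterns, matrix = build_matrix(occurrences, max_pairs=2, max_patterns=2)
```

Here the infrequent pattern "X rarely seen Y" is dropped, leaving a 2 x 2 matrix of co-occurrence counts.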

Page 7: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm
1. Input: a data matrix, and row and column clustering thresholds.
2. Sort the rows and columns of the matrix in descending order of their total frequencies.
3. For rows and columns, do:
   - Compute the similarity between the current row (column) and the existing row (column) clusters.
   - If maximum similarity < row (column) clustering threshold: create a new row (column) cluster with the current row (column).
   - Else: assign the current row (column) to the cluster with the maximum similarity.
   - Repeat until all rows and columns are clustered.
4. Return the row and column clusters.
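The steps above can be sketched in plain Python. Several details are assumptions rather than statements from the slides: cosine similarity is used as the similarity measure, a cluster is represented by the sum of its members, and rows are compared over the current column clusters (and vice versa). These choices reproduce the similarity values shown in the worked example (0.067, 0.998, 0.84, …), but the helper names are hypothetical and the sketch makes no attempt to reach the O(n log n) complexity quoted later.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def collapse(M, clusters):
    """Sum the columns of M within each cluster; unclustered columns stay separate."""
    assigned = {j for c in clusters for j in c}
    groups = clusters + [[j] for j in range(len(M[0])) if j not in assigned]
    return [[sum(row[j] for j in g) for g in groups] for row in M]

def assign(M, item, clusters, other_clusters, threshold):
    """Put row `item` of M into its most similar cluster, or open a new cluster."""
    V = collapse(M, other_clusters)
    centroids = [[sum(V[i][k] for i in c) for k in range(len(V[0]))] for c in clusters]
    sims = [cosine(V[item], cen) for cen in centroids]
    if sims and max(sims) >= threshold:
        clusters[sims.index(max(sims))].append(item)
    else:
        clusters.append([item])

def sequential_cocluster(M, row_threshold, col_threshold):
    Mt = [list(col) for col in zip(*M)]  # transpose: patterns as rows
    # Process rows and columns in descending order of total frequency.
    rows = sorted(range(len(M)), key=lambda i: -sum(M[i]))
    cols = sorted(range(len(Mt)), key=lambda j: -sum(Mt[j]))
    row_clusters, col_clusters = [], []
    while rows or cols:  # alternate between one row and one column
        if rows:
            assign(M, rows.pop(0), row_clusters, col_clusters, row_threshold)
        if cols:
            assign(Mt, cols.pop(0), col_clusters, row_clusters, col_threshold)
    return row_clusters, col_clusters

# Rows: (Jobs, Apple), (Ballmer, Microsoft), (Microsoft, Powerset), (Google, YouTube)
# Columns: X acquired Y, Y CEO X, X buys Y for $, X of Y, Y head X
M = [[0, 5, 0, 2, 1],
     [0, 8, 0, 3, 2],
     [5, 0, 1, 1, 0],
     [6, 0, 8, 2, 0]]
row_clusters, col_clusters = sequential_cocluster(M, 0.5, 0.5)
# row_clusters == [[3, 2], [1, 0]]; col_clusters == [[1, 3, 4], [0, 2]]
```

On this matrix the two acquisition pairs end up in one row cluster and the two CEO pairs in the other, with the patterns split the same way.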

Page 8: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns: (1) X acquired Y, (2) Y CEO X, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         0 5 0 2 1   =8
(Ballmer, Microsoft):  0 8 0 3 2   =13
(Microsoft, Powerset): 5 0 1 1 0   =7
(Google, YouTube):     6 0 8 2 0   =16

Row clustering threshold = column clustering threshold = 0.5

Page 9: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns: (1) X acquired Y, (2) Y CEO X, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         0 5 0 2 1
(Ballmer, Microsoft):  0 8 0 3 2
(Microsoft, Powerset): 5 0 1 1 0
(Google, YouTube):     6 0 8 2 0

Column totals: X acquired Y =11, Y CEO X =13, X buys Y for $ =9, X of Y =8, Y head X =3

Page 10: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns, sorted by total: (1) Y CEO X =13, (2) X acquired Y =11, (3) X buys Y for $ =9, (4) X of Y =8, (5) Y head X =3

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Page 11: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Page 12: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Row clusters: [(Google, YouTube)]

Page 13: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

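Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Row clusters: [(Google, YouTube)]
Column clusters: [Y CEO X]

Similarity of (Ballmer, Microsoft) to [(Google, YouTube)]: 0.067 < 0.5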

Page 14: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

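Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Row clusters: [(Google, YouTube)], [(Ballmer, Microsoft)]
Column clusters: [Y CEO X]

Similarity of X acquired Y to [Y CEO X]: 0 < 0.5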

Page 15: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

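Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

(Jobs, Apple):         5 0 0 2 1
(Ballmer, Microsoft):  8 0 0 3 2
(Microsoft, Powerset): 0 5 1 1 0
(Google, YouTube):     0 6 8 2 0

Row clusters: [(Google, YouTube)], [(Ballmer, Microsoft)]
Column clusters: [X acquired Y], [Y CEO X]

Similarity of (Jobs, Apple) to [(Google, YouTube)]: 0.071
Similarity of (Jobs, Apple) to [(Ballmer, Microsoft)]: 0.998 > 0.5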

Page 16: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

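Sequential Co-clustering Algorithm

Columns: (1) Y CEO X, (2) X acquired Y, (3) X buys Y for $, (4) X of Y, (5) Y head X

[(Ballmer, Microsoft), (Jobs, Apple)]: 13 0 0 5 3
(Microsoft, Powerset):                 0 5 1 1 0
[(Google, YouTube)]:                   0 6 8 2 0

Column clusters: [X acquired Y], [Y CEO X]

Similarity of X buys Y for $ to [Y CEO X]: 0
Similarity of X buys Y for $ to [X acquired Y]: 0.84 > 0.5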

Page 17: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

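Sequential Co-clustering Algorithm

Columns: (1) [Y CEO X], (2) [X acquired Y, X buys Y for $], (3) X of Y, (4) Y head X

[(Ballmer, Microsoft), (Jobs, Apple)]: 13 0 5 3
(Microsoft, Powerset):                 0 6 1 0
[(Google, YouTube)]:                   0 14 2 0

Similarity of (Microsoft, Powerset) to [(Google, YouTube)]: 0.99 > 0.5
Similarity of (Microsoft, Powerset) to [(Ballmer, Microsoft), (Jobs, Apple)]: 0.05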

Page 18: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

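Sequential Co-clustering Algorithm

Columns: (1) [Y CEO X], (2) [X acquired Y, X buys Y for $], (3) X of Y, (4) Y head X

[(Ballmer, Microsoft), (Jobs, Apple)]:      13 0 5 3
[(Google, YouTube), (Microsoft, Powerset)]: 0 20 3 0

Similarity of X of Y to [Y CEO X]: 0.85 > 0.5
Similarity of X of Y to [X acquired Y, X buys Y for $]: 0.51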

Page 19: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

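Sequential Co-clustering Algorithm

Columns: (1) [Y CEO X, X of Y], (2) [X acquired Y, X buys Y for $], (3) Y head X

[(Ballmer, Microsoft), (Jobs, Apple)]:      18 0 3
[(Google, YouTube), (Microsoft, Powerset)]: 3 20 0

Similarity of Y head X to [Y CEO X, X of Y]: 0.98 > 0.5
Similarity of Y head X to [X acquired Y, X buys Y for $]: 0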

Page 20: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Sequential Co-clustering Algorithm

Columns: (1) [Y CEO X, X of Y, Y head X], (2) [X acquired Y, X buys Y for $]

[(Ballmer, Microsoft), (Jobs, Apple)]:      21 0
[(Google, YouTube), (Microsoft, Powerset)]: 3 20

Entity pair clusters: [(Google, YouTube), (Microsoft, Powerset)], [(Ballmer, Microsoft), (Jobs, Apple)]
Lexico-syntactic pattern clusters: [X acquired Y, X buys Y for $], [Y CEO X, X of Y, Y head X]

- A greedy clustering algorithm
- Alternates between rows and columns
- Complexity O(n log n)
- Common relations are clustered first
- The number of clusters is not required
- Two thresholds to determine

Page 21: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Estimating the Clustering Thresholds
- Ideally, each cluster must represent a unique semantic relation.
  - Number of clusters = number of semantic relations
  - The number of semantic relations is unknown.
- Thresholds can be either estimated via cross-validation (requires training data) OR approximated using the similarity distribution.
- The similarity distribution is approximated using a Zeta distribution (Zipf's law).
- Ideal clustering:
  - inter-cluster similarity = 0
  - intra-cluster similarity = mean
- With a large number of data points:
  - average similarity in a cluster ≥ threshold
  - threshold ≈ distribution mean
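As a rough illustration of "threshold ≈ distribution mean", the sketch below takes the empirical mean of all pairwise cosine similarities as the threshold. This is a simplification and an assumption: the slides fit a Zeta distribution to the similarities, which is not reproduced here, and the function name `approximate_threshold` is hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def approximate_threshold(vectors):
    """Empirical mean of pairwise similarities, a stand-in for the fitted distribution mean."""
    sims = [cosine(u, v) for u, v in combinations(vectors, 2)]
    return sum(sims) / len(sims)

# Rows of the worked example matrix (entity pairs as pattern-frequency vectors).
rows = [[0, 5, 0, 2, 1], [0, 8, 0, 3, 2], [5, 0, 1, 1, 0], [6, 0, 8, 2, 0]]
row_threshold = approximate_threshold(rows)
```

The same function applied to the columns of the matrix would give a column clustering threshold.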

Page 22: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Measuring Relational Similarity
- Empirically evaluate the clusters produced.
- Use the clusters to measure relational similarity (Bollegala, WWW 2009).
- Distance = [formula not preserved]
- ENT dataset: 5 relation types, 100 instances
- Task: query using each entity pair and rank using relational distance

Relation                   VSM   LRA   EUC   RELSIM  Proposed
ACQUISITION                0.92  0.92  0.91  0.94    0.89
HEADQUARTERS               0.84  0.82  0.79  0.86    0.97
FIELD                      0.44  0.43  0.51  0.57    0.42
CEO                        0.95  0.96  0.90  0.95    0.99
BIRTHPLACE                 0.27  0.27  0.33  0.36    0.53
Overall Average Precision  0.68  0.68  0.69  0.74    0.76
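For reference, average precision over a ranked list can be computed as in the sketch below. It assumes the standard definition (mean of precision at each rank where a relevant item appears); the slides do not spell out which variant was used, and the relevance list is made up.

```python
def average_precision(relevant):
    """`relevant` is a ranked list of booleans: True if the item at that rank is correct."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0

# Hypothetical ranking returned for one query entity pair.
ap = average_precision([True, False, True, True, False])
```

For this ranking the precisions at the relevant ranks are 1/1, 2/3 and 3/4, so `ap` is their mean.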

Page 23: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Self-supervised Relation Detection
- What is the relation represented by a cluster?
- Label each cluster with a lexical pattern selected from that cluster.

[Figure: entity pairs vs. patterns matrix partitioned into entity pair clusters C1, C2, …, Ck; each entity pair is represented by its pattern frequencies, e.g. (Google, YouTube) = [X acquired Y: 10, …]]

- Train an L1 regularized multi-class logistic regression model (MaxEnt) to discriminate the k classes.
- Select the highest weighted lexical patterns from each class.
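A minimal sketch of this labeling step is shown below, using NumPy and proximal gradient descent. The optimizer, the hyperparameters (`lam`, `lr`, `n_iter`), the row normalization and the function names are assumptions for illustration; the slides only state that an L1 regularized multi-class logistic regression model is trained and that the highest weighted patterns are selected. Rows are assumed to be non-zero.

```python
import numpy as np

def soft_threshold(W, t):
    """Proximal operator of the L1 norm: shrink every weight toward zero by t."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)

def train_l1_softmax(X, y, n_classes, lam=0.01, lr=0.5, n_iter=500):
    """L1-regularized multi-class logistic regression via proximal gradient descent."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length feature vectors
    Y = np.eye(n_classes)[y]                          # one-hot cluster labels
    W = np.zeros((n_classes, X.shape[1]))
    for _ in range(n_iter):
        Z = X @ W.T
        Z -= Z.max(axis=1, keepdims=True)             # numerically stable softmax
        P = np.exp(Z)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y).T @ X / len(X)
        W = soft_threshold(W - lr * grad, lr * lam)
    return W

def label_clusters(X, y, patterns, n_classes):
    """Label each cluster with its highest-weighted pattern."""
    W = train_l1_softmax(X, y, n_classes)
    return [patterns[int(np.argmax(W[c]))] for c in range(n_classes)]

patterns = ["X acquired Y", "Y CEO X", "X buys Y for $", "X of Y", "Y head X"]
# Rows: (Jobs, Apple), (Ballmer, Microsoft), (Microsoft, Powerset), (Google, YouTube)
X = [[0, 5, 0, 2, 1], [0, 8, 0, 3, 2], [5, 0, 1, 1, 0], [6, 0, 8, 2, 0]]
y = [1, 1, 0, 0]  # cluster ids: 0 = acquisition pairs, 1 = CEO pairs
labels = label_clusters(X, y, patterns, n_classes=2)
```

The entity pair clusters serve as the class labels, so no manually annotated training data is needed.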

Page 24: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Subjective Evaluation of Relation Labels
- Baseline: select the most frequent lexical pattern in a cluster as its label.
- Ask three human judges to assign grades:
  - A: baseline is better
  - B: proposed method is better
  - C: both equally good
  - D: both bad

Relation      A      B      C      D
ACQUISITION   16.7%  40%    40%    3.3%
HEADQUARTERS  20%    40%    23.3%  16.7%
CEO           6.7%   53.3%  20%    20%
FIELD         13.3%  56.7%  23.3%  6.7%
BIRTHPLACE    13.3%  36.7%  10%    40%
Overall       14%    45.3%  23.3%  17.3%

Page 25: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Open Information Extraction
- Sent500 dataset (Banko and Etzioni, ACL 2008): 500 sentences, 4 relation types
- Lexical patterns: 947; syntactic patterns: 384
- 4 row clusters, 14 column clusters

Method                      Precision  Recall  F
O-NB                        0.866      0.232   0.366
O-CRF                       0.883      0.452   0.598
MLN                         0.798      0.733   0.764
PROP (lexical)              0.943      0.647   0.767
PROP (syntactic)            0.752      0.860   0.802
PROP (lexical + syntactic)  0.751      0.857   0.801
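The F column is consistent with the balanced F-measure, the harmonic mean of precision and recall. The short check below recomputes it from the precision and recall values in the table; that the slides use this balanced form is an assumption, although the numbers agree to three decimals.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F) taken from the table above.
results = {
    "O-NB": (0.866, 0.232, 0.366),
    "O-CRF": (0.883, 0.452, 0.598),
    "MLN": (0.798, 0.733, 0.764),
    "PROP (lexical)": (0.943, 0.647, 0.767),
    "PROP (syntactic)": (0.752, 0.860, 0.802),
    "PROP (lexical + syntactic)": (0.751, 0.857, 0.801),
}
recomputed = {name: f_measure(p, r) for name, (p, r, _) in results.items()}
```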

Page 26: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Classifying Relations in a Social Network

spysee.jp

Page 27: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Relation Classification

Dataset
- 790,042 nodes (people), 61,339,833 edges (relations)
- Randomly select 50,000 edges and manually classify into 53 classes
- 11,193 lexical patterns, 383 pattern clusters, 664 entity pair clusters

Relation    P     R     F
colleagues  0.76  0.87  0.81
friends     0.58  0.77  0.66
alumni      0.83  0.68  0.75
co-actors   0.75  0.74  0.74
fan         0.91  0.50  0.64
teacher     0.83  0.73  0.78
husband     0.89  0.57  0.74
wife        0.67  0.34  0.45
brother     0.79  0.60  0.68
sister      0.90  0.52  0.66
Micro       0.72  0.68  0.70
Macro       0.78  0.52  0.63

Page 28: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Conclusions
- Dual representation of semantic relations leads to a natural co-clustering algorithm.
- Clustering both entity pairs and lexico-syntactic patterns simultaneously helps to overcome data sparseness in both dimensions.
- The co-clustering algorithm scales as O(n log n) with the data.
- Clusters produced can be used to:
  - Measure relational similarity with performance comparable to supervised approaches
  - Perform Open Information Extraction tasks
  - Classify relations found in a social network

Page 29: Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web

Thank You
