Graph-based Cluster Labeling Using Growing Hierarchical SOM

The Second International Conference of Applied Science & Natural

Prepared by:
Mahmoud Rafeek Alfarra, College of Science & Technology, [email protected]
Ayman Shehda Ghabayen, College of Science & Technology, [email protected]

Page 1: Graph-based cluster labeling using Growing Hierarchal SOM

Graph-based Cluster Labeling Using Growing Hierarchical SOM

The Second International Conference of Applied Science & Natural

Prepared by:
Mahmoud Rafeek Alfarra, College of Science & Technology, [email protected]
Ayman Shehda Ghabayen, College of Science & Technology, [email protected]

Page 2: Graph-based cluster labeling using Growing Hierarchal SOM

Outline

Labeling: What and Why?

Graph-based Representation

Growing Hierarchical SOM

Extraction of Cluster Labels

Page 3: Graph-based cluster labeling using Growing Hierarchal SOM

Labeling: What and Why?

Cluster labeling is the process of selecting descriptive labels (keywords) for the clusters produced by a clustering algorithm.

Page 4: Graph-based cluster labeling using Growing Hierarchal SOM

Labeling: What and Why?

Cluster labeling is an increasingly important task because:

1. Document collections keep growing larger.
2. It helps when working with news, email threads, blogs, reviews, and search results.

Page 5: Graph-based cluster labeling using Growing Hierarchal SOM

Labeling: What and Why?

[Figure: system overview. Labeled components: document collection; preprocessing step (DIG model); clustering process + labeling (SOM, Growing Hierarchical SOM); labeled document clusters.]

Page 6: Graph-based cluster labeling using Growing Hierarchal SOM

Graph-based Representation

[Figure: example graph with node labels A, B, X, D, N, C, S, numeric edge labels, and phrases ph1 to ph5.]

Page 7: Graph-based cluster labeling using Growing Hierarchal SOM

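Graph-based Representation

The representation captures the salient features of the data.

DIG model: a directed graph.

A document is represented as a vector of sentences. Phrase indexing information is stored in the graph nodes themselves, in the form of document tables.

[Figure: the nodes "river", "rafting", "adventures", and "fishing", connected by edges e0, e1, and e2, together with the document table of one node.]

Document table:

Doc   TF        ET
1     {0,0,3}   e0: S1(1), S2(2), S3(1)
2     {0,0,2}   e0: S2(1); e2: S1(2)
3     {0,0,1}   e1: S4(1)

In an entry such as S1(2), S1 is the sentence number and the number in parentheses is the position of the term in that sentence.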

Page 8: Graph-based cluster labeling using Growing Hierarchal SOM

Graph-based Representation

Example

Document 1
River rafting
Mild river rafting
River rafting trips

Document 2
Wild river adventures
River rafting vacation plan

Document 3
Fishing trips
Fishing vacation plan
Booking fishing trips
River fishing

[Figure: the graph is built incrementally. After Document 1 it contains the nodes mild, river, rafting, trips; Document 2 adds wild, adventures, vacation, plan; the cumulative graph also contains booking and fishing.]
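The incremental construction in the example can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the function name `build_dig` and the dictionary layout are assumptions, and only sentence number and term position are recorded for each edge.

```python
from collections import defaultdict


def build_dig(documents):
    """Build a simplified document index graph (hypothetical helper).

    documents maps a document id to a list of sentences.
    Returns (nodes, edges): nodes is the set of terms, and edges maps an
    ordered term pair to {doc_id: [(sentence_no, position_of_first_term)]}.
    """
    nodes = set()
    edges = defaultdict(lambda: defaultdict(list))
    for doc_id, sentences in documents.items():
        for sentence_no, sentence in enumerate(sentences, start=1):
            words = sentence.lower().split()
            nodes.update(words)
            # Every pair of consecutive words becomes a directed edge.
            for pos, (a, b) in enumerate(zip(words, words[1:]), start=1):
                edges[(a, b)][doc_id].append((sentence_no, pos))
    return nodes, edges


docs = {
    1: ["river rafting", "mild river rafting", "river rafting trips"],
    2: ["wild river adventures", "river rafting vacation plan"],
}
nodes, edges = build_dig(docs)
river_rafting = dict(edges[("river", "rafting")])
print(river_rafting)
```

For the edge from "river" to "rafting", Document 1 contributes sentence 1 (position 1), sentence 2 (position 2), and sentence 3 (position 1), the same information that a document table entry written as S1(1), S2(2), S3(1) encodes.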

Page 9: Graph-based cluster labeling using Growing Hierarchal SOM

Growing Hierarchical SOM

Page 10: Graph-based cluster labeling using Growing Hierarchal SOM

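Growing Hierarchical SOM

Determining the winning node

[Figure: the graph Gs of one of the n nodes in the SOM (vertices v1 to v7, edges e0 to e5) next to the input document graph Gi (vertices v1, v2, v5, v6, v7; edges e0, e1, e3, e5).]

[Formula: phrase significance, expressed in terms of Gi, Gs, and the length of Gi.]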
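One way to picture winner selection in the graph domain is an overlap score between the input graph and each SOM node graph. The sketch below assumes that graphs are stored as sets of directed edges (phrases) and that the score is the share of input edges found in the neuron graph. Both the score and the names `phrase_match_score` and `find_winner` are illustrative assumptions, not the formula from the slide.

```python
def phrase_match_score(input_edges, neuron_edges):
    """Share of the input graph's edges that also occur in the neuron graph."""
    if not input_edges:
        return 0.0
    return len(input_edges & neuron_edges) / len(input_edges)


def find_winner(input_edges, som_nodes):
    """Return the index of the SOM node graph with the highest score."""
    scores = [phrase_match_score(input_edges, g) for g in som_nodes]
    return max(range(len(scores)), key=scores.__getitem__)


# Input document graph Gi as a set of directed edges (made-up example).
gi = {("v1", "v2"), ("v2", "v5"), ("v5", "v7"), ("v6", "v7")}

# Two hypothetical SOM node graphs Gs.
som = [
    {("v1", "v2"), ("v3", "v4")},
    {("v1", "v2"), ("v2", "v5"), ("v5", "v7"), ("v3", "v4")},
]

winner = find_winner(gi, som)
print(winner, phrase_match_score(gi, som[winner]))
```

Dividing by the size of the input graph keeps the score between 0 and 1, so long and short documents can be compared against the same map.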

Page 11: Graph-based cluster labeling using Growing Hierarchal SOM

Growing Hierarchical SOM

Neuron updating in the graph domain

[Figure: two graphs, G1 and G2, with nodes labeled A to E, X, and Y and edges e0 to e5.]

We update graphs by increasing the number of matching phrases, because this has a stronger effect than adding terms (nodes). Adding a matching phrase can also be regarded as adding an ordered pair of nodes.
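A minimal sketch of such an update, assuming graphs are sets of directed edges and that a learning rate decides how many of the missing phrases are copied from the input graph into the neuron graph. The function `update_neuron`, the learning-rate rule, and the sample graphs are assumptions made for illustration.

```python
import math


def update_neuron(neuron_edges, input_edges, learning_rate):
    """Return a new neuron graph that shares more phrases with the input.

    Each added phrase is an ordered pair of nodes taken from the input graph.
    """
    missing = sorted(input_edges - neuron_edges)
    n_add = math.ceil(learning_rate * len(missing))
    return neuron_edges | set(missing[:n_add])


g1 = {("A", "B"), ("B", "C"), ("B", "D")}              # neuron graph
g2 = {("A", "B"), ("B", "D"), ("D", "E"), ("C", "E")}  # input graph

updated = update_neuron(g1, g2, learning_rate=0.5)
print(sorted(updated))
```

With a learning rate of 0.5, one of the two missing phrases is added, so the number of phrases shared with the input grows from two to three while the neuron keeps its existing edges.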

Page 12: Graph-based cluster labeling using Growing Hierarchal SOM

Overall Document Clustering Process

Page 13: Graph-based cluster labeling using Growing Hierarchal SOM

Extracting Cluster Labels

To extract the keywords, we build a table for each cluster with the following columns:

Term   TF-Locations {T, L, B, b}   No. of matching phrases (MP)   Weight

Weight = (f1*T + f2*L + f3*B + f4*b) * 0.4 + MP * 0.6
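The weight formula translates directly into a small function. In this sketch the factor values f1 to f4 and the sample counts are placeholders chosen for illustration, since the slides do not give them; only the 0.4 and 0.6 coefficients come from the formula above.

```python
def term_weight(t, l, big_b, small_b, mp, factors):
    """Weight = (f1*T + f2*L + f3*B + f4*b) * 0.4 + MP * 0.6."""
    f1, f2, f3, f4 = factors
    location_score = f1 * t + f2 * l + f3 * big_b + f4 * small_b
    return location_score * 0.4 + mp * 0.6


# Hypothetical counts per location and hypothetical location factors.
w = term_weight(t=1, l=0, big_b=2, small_b=5, mp=2, factors=(4.0, 3.0, 2.0, 1.0))
print(round(w, 2))
```

Here the location score is 4 + 0 + 4 + 5 = 13, which contributes 5.2, and the two matching phrases contribute 1.2.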

Page 14: Graph-based cluster labeling using Growing Hierarchal SOM

Extracting Cluster Labels

[Figure: cluster graph with nodes T1 to T11.]

Term   F-weight   # MP                             Net weight
T2     12.4       2  (T2,T3), (T2,T5)              4.96 + 1.2 = 6.16
T3     10.2       2  (T2,T3), (T5,T3)              4.08 + 1.2 = 5.28
T5     16.6       3  (T2,T5), (T8,T5), (T5,T3)     6.64 + 1.8 = 8.44
T8     14.4       1  (T8,T5)                       5.76 + 0.6 = 6.36
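The net weights can be recomputed from the F-weight and the matching phrases of each term, and the terms can then be ranked. The sketch below uses the F-weights and phrase pairs from the table; taking the two highest-ranked terms as labels is an assumption, since the slides do not state how many labels are kept.

```python
def net_weight(f_weight, matching_phrases):
    """Net weight = 0.4 * F-weight + 0.6 * number of matching phrases."""
    return 0.4 * f_weight + 0.6 * len(matching_phrases)


cluster_terms = {
    "T2": (12.4, [("T2", "T3"), ("T2", "T5")]),
    "T3": (10.2, [("T2", "T3"), ("T5", "T3")]),
    "T5": (16.6, [("T2", "T5"), ("T8", "T5"), ("T5", "T3")]),
    "T8": (14.4, [("T8", "T5")]),
}

weights = {term: net_weight(*data) for term, data in cluster_terms.items()}
ranked = sorted(weights, key=weights.get, reverse=True)
labels = ranked[:2]  # assumed cut-off: keep the two heaviest terms

for term in ranked:
    print(term, round(weights[term], 2))
```

A term with many matching phrases can outrank a term with a similar F-weight, which is the effect the 0.6 coefficient is meant to produce.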

Page 15: Graph-based cluster labeling using Growing Hierarchal SOM

Thank You … Questions