an introduction to graph mining - leiden...

49
An Introduction to Graph Mining An Introduction to Graph Mining Hossein Rahmani [email protected] 8 December 2009 Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 1 / 49

Upload: others

Post on 13-Mar-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

An Introduction to Graph Mining

Hossein Rahmani

[email protected]

8 December 2009

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 1 / 49

Page 2: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Outline

From Data Mining To Graph MiningGraph AlgorithmsFunction Prediction in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 2 / 49

Page 3: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

From Data Mining To Graph Mining

Data MiningClassificationClusteringAssociation rule learning

Graph MiningPowerful way to representdataoutput: expressed asgraphs

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 3 / 49

Page 4: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Mining Domains

Internet Movie DatabaseWeb DataSocial Networks AnalysisBio-Informatics

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 4 / 49

Page 5: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Internet Movie Database

Movie RecommendationCommunity DetectionPrediction: $2 million in opening weekend

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 5 / 49

Page 6: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Web Mining

Web Content MiningTopic Prediction

Web Structure MiningCommunity Mining

Web Usage MiningWebsite Roadmap

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 6 / 49

Page 7: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Social Networks

Relationships and flowsbetween peopleTechnologies

Email, BlogsSocial Networking Softwarelike Orkut, FaceBook

Questions:Who has control over whatflows in the Network?Who has best visibility of whatis happening in the Network?Customer Network Value

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 7 / 49

Page 8: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Protein-Protein Interaction(PPI) Network

Nodes: ProteinsEdges: Interaction among theProteinsOpen Problems:

Function PredictionDrug Discovery

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 8 / 49

Page 9: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Function Prediction in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 9 / 49

Page 10: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Drug Discovery in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 10 / 49

Page 11: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Outline

From Data Mining To Graph MiningGraph Algorithms

Some DefinitionsGraph MatchingGraph CompressionGraph based Decision Tree

Function Prediction in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 11 / 49

Page 12: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Definition

Graph: G = (V ,E, µ, v)V : finite set of nodes.E ⊆ V× V denotes a set of edges.µ : V → LV denotes a node labeling function.v : E → LE denotes an edge labeling function.

Let G1 = (V1,E1,µ1, v1) and G2 = (V2,E2,µ2, v2)Graph G1 is a subgraph of G2, written G1 ⊆ G2,if:

V1 ⊆ V2E1 ⊆ E2µ1(u) = µ2(u) for all u ∈ V1.v1(u, v) = v2(u, v) for all (u, v) ∈ E1.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 12 / 49

Page 13: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Definition

Let G1 = (V1,E1,µ1, v1) and G2 = (V2,E2,µ2, v2),A graph isomorphism between G1 and G2 is a bijective functionf : V1 → V2 satisfying

µ1(u) = µ2(f (u)) for all nodes u ∈ V1.For every edge e1 = (u, v) ∈ E1, there exists an edgee2 = (f (u), f (v)) ∈ E2 such that v1(e1) = v2(e2).

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 13 / 49

Page 14: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Definition

Let G1 = (V1,E1,µ1, v1) and G2 = (V2,E2,µ2, v2)Any bijective function f : V̂1 → V̂2, where V̂1 ⊆ V1 and V̂2 ⊆ V2 iscalled edit path from G1 to G2.Example: f = {u1 → v3,u2 → ε, ..., ε→ v6}u1 → v3: Substitution of node u1 ∈ V1 by node v3 ∈ V2

u2 → ε: Deletion of node u2 ∈ V1

ε→ v6: Insertion of v6 ∈ V2

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 14 / 49

Page 15: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Frequent Subgraph

Frequent subgraphs:support (subgraph) >=minimum support

Usage:Graph ClassificationGraph ClusteringGraph Indexing

Detection Algorithms:Apriori-Based ApproachPattern Growth Approach

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 15 / 49

Page 16: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Apriori-Based Approach

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 16 / 49

Page 17: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Pattern Growth Approach

A graph G is extended by adding new edge e.Edge e may or may not introduce a new node to G.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 17 / 49

Page 18: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Compression

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 18 / 49

Page 19: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Compression

Portion of DNAEasy to Analyze?

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 19 / 49

Page 20: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Compression

Si compresses the inputDNA.Cluster hierarchy

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 20 / 49

Page 21: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Based Decision Tree

Decision TreeEach branch corresponds to attribute valueEach leaf node assigns a classification

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 21 / 49

Page 22: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Based Decision Tree

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 22 / 49

Page 23: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Based Decision Tree

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 23 / 49

Page 24: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Graph Based Decision Tree

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 24 / 49

Page 25: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Outline

From Data Mining To Graph MiningGraph AlgorithmsFunction Prediction in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 25 / 49

Page 26: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Function Prediction in PPI Networks

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 26 / 49

Page 27: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Formal Description of PPI Networks

Represented as an undirected graphNode set V → ProteinsEdge set E → Direct interaction

∀v ∈ V is described by a description d(v) ∈ Dd(v) derived from the network structureNo additional information, such as the protein structure is available

∀v ∈ V optionally is annotated with a label l(v) ∈ LLabels l(v) are sets of protein functionsE.g., metabolism, transcription, protein synthesis and etc

We assume there is a true labeling function λ that is l(v) = λ(v)where l(v) is definedTask: Find a suitable λ(v) where l(v) is not defined

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 27 / 49

Page 28: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Related Works

1 Transductive approachesLocal: Majority Rule and its extensionsGlobal: Global Optimization andFunctional Clustering

2 Inductive approachesLocal: Topological RedundanciesGlobal: ? → Our Method

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 28 / 49

Page 29: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Majority Rule

Local transductive methodAssumption: Two Interactingproteins have something incommon (e.g., same function)Predicted function: Mostcommon function(s) amongclassified partners

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 29 / 49

Page 30: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Majority Rule

Problem: Links unclassified-unclassified proteins completelyneglectedSolution: Global optimization methods

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 30 / 49

Page 31: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Functional Clustering

Global transductive methodAssumption: Dense regions are a sign ofthe common involvement in biologicalprocessPredict the function of unclassified proteinbased on the cluster they belong to

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 31 / 49

Page 32: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Our Method: A Global Description of Proteins

Global inductive approachNode description

N nodes in the network numbered from 1 to NEach node is described by an n-dimensionalvectori ’th component in the vector of node v gives thelength of shortest path between v and node iProbelm: Large Graph→ very high dimensionaldescriptionsSolution: Reduce dimensionality by focusing onshortest-path distance to a few "important" nodes

Feature selection problem

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 32 / 49

Page 33: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Important Proteins

Definition: Node i is important if the shortest-path distance ofsome node v to node i is likely to be relevant for v ’s classificationFeature fi denotes the shortest-path distance to node iAnova based feature selection

For each function j , let Gj be the set of all proteins that have thatfunction jLet f̄ij be the average fi value in GjLet var(fij ) the variance of the fi in GjAnova (analysis of variance) based relevancy measure:

Ai =Varj [f̄ij ]

Meanj [var(fij )](1)

A high Ai denotes a high relevance of feature fi

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 33 / 49

Page 34: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Important Proteins

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 34 / 49

Page 35: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Comparison of Learners

Random Forest performs best among all the learners in 13 out of17 casesOther 4 cases are all characterized by a very high class skewRandom Forest: Best candidate for learning from this type of data

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 35 / 49

Page 36: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Different Number of Important Proteins

We select the 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80,90, 100, 200 and 300 most important proteins.Describe each protein using its distance to the n most importantproteins.Using random forest for function prediction and record precision,recall, F-measure and AUC.In 4 Datasets for 17 functions.The shape of the curves is qualitatively very similar in all cases.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 36 / 49

Page 37: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Less than 10 Important Proteins

Bad Bad predictive performance.They do not reach their maximum.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 37 / 49

Page 38: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

10-50 Important Proteins

There is usually a major improvement in all four metrics.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 38 / 49

Page 39: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

50-70 Important Proteins

For most of the functions, selecting 50-70 important proteins isenough to obtain good classication results.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 39 / 49

Page 40: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Conclusions

Graph Mining DomainsIntroduction to Graph Algorithms

Graph MatchingGraph CompressionGraph based Decision Tree

Protein Function prediction

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 40 / 49

Page 41: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Thanks!

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 41 / 49

Page 42: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

References

Cook, D. J. and Holder, L. B. 2006 Mining Graph Data. John Wiley& Sons.Rahmani, H., Blockeel, H. and Bender, A., Proc. 3rd Int.Workshop in Machine Learn. Syst. Biol. (MLSB09) 2009 85 – 94.

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 42 / 49

Page 43: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Transductive Learning

Task: Predict the label of all the nodesInput: G = (V ,E ,d , l) with l a partial functionOutput: Complete version G′ = (V ,E ,d , l ′) with l ′ a completefunction that is consistent with ll ′ should approximates λ by optimization criterion oo expresses our assumption about λ

E.g., directly connected nodes tend to have similar labelsNumber of {v1, v2} edges where l ′(v1) 6= l ′(v2) edges should beminimal

Our assumption about λ is called bias of transductive learner

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 43 / 49

Page 44: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Inductive Learning

Task: Learn a function f : D → L that maps a node descriptiond(v) onto its label l(v)

Input: G = (V ,E ,d , l) with l a partial functionOutput: f : D → L such that f (d(v)) = l(v)

Note: f differs from l in that it maps D, not V , onto LIt can make prediction for node v that is not in the original network,as long as d(v) is known

BiasesTransductive bias: Assumption expressed by optimization criterionoDescription bias DInductive bias: Choice of learning algorithm that is used to learn ffrom (d(v), l(v)) couples

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 44 / 49

Page 45: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Global Optimization

Global transductive methodLinks unclassified/unclassified proteinsalso taken into accountAny probable function assignment to thewhole set of unclassified proteins isconsidered

Counting number of interacting pairswith no common functionsSelect the function assignment withlowest value

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 45 / 49

Page 46: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Topological Approaches

Local inductive methodNode description d(v) is built based on the local neighborhoodCount number of patterns (e.g., graphlet) around the proteinsMake the signature vector for each proteinAssumption: Proteins with high similar signature vector havesame functions

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 46 / 49

Page 47: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Topological Approaches

Some topological patterns (Number of considered patterns = 73)

Orbit: One of the previous patternsSame orbit frequency→ same function

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 47 / 49

Page 48: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Topological Approaches

Local inductive methodNode description d(v) is built based on the local neighborhoodCount number of patterns (e.g., graphlet) around the proteinsMake the signature vector for each proteinAssumption: Proteins with high similar signature vector havesame functions

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 48 / 49

Page 49: An Introduction to Graph Mining - Leiden Universityliacs.leidenuniv.nl/~bakkerem2/dbdm2009/GraphMining.pdf · 2009. 12. 18. · An Introduction to Graph Mining Important Proteins

An Introduction to Graph Mining

Topological Approaches

Some topological patterns (Number of considered patterns = 73)

Orbit: One of the previous patternsSame orbit frequency→ same function

Hossein Rahmani (LIACS) An Introduction to Graph Mining 8 December 2009 49 / 49