

Deep Representation Learning on

Large Attributed Graphs

Debora Nozza1, Enza Messina1

1DISCo, University of Milano - Bicocca, Milan, Italy

Introduction and Motivation

The research project aims at designing and developing novel unsupervised models for learning a graph representation from large heterogeneous attributed graphs, one that comprises both the structure of the graph and the attributes associated with each node. The proposed graph representation learning model will be based on deep learning models, strengthened by an efficient optimization algorithm able to scale to large graphs.

While the majority of research contributions on Deep Learning for Representation Learning focus on efficiently learning good representations for i.i.d. data [1, 2], real-world problems can be represented in various forms; in particular, relational structures are common representations, e.g. airline networks, publication networks, social and communication networks, and the World Wide Web.

Dealing with relational structures, such as graphs, is complex and computationally expensive because of several characteristics of the data, i.e., size, dynamic nature, noise, and heterogeneity [3]. One efficient approach for handling potentially large and complex graphs is to learn graph representations, or Graph Embeddings [4, 5], which assign to each node of the graph a low-dimensional dense vector representation encoding meaningful information conveyed by the graph.

Several state-of-the-art graph representation learning approaches focus only on the graph structure to compute graph embeddings [6, 7, 8]. However, nodes in real-world graphs are often associated with a rich set of features or attributes (e.g. text, images, audio), originating the so-called attributed graph.

Background concepts

The main goal of the research project is to map the nodes of an attributed graph into a low-dimensional embedded space that preserves not only the local and global relational structure but also the attribute information.

Definition 1. An attributed graph is defined as G = (V, E, Π, S), where V = {v_1, ..., v_n} is the set of nodes, E = {(v_i, v_j) | v_i, v_j ∈ V} denotes the set of edges between the nodes, Π = {π_v1, ..., π_vn} represents the node attributes related to each node v_i, and S = {s_ij}, i, j = 1, ..., n, is the matrix denoting the weights s_ij associated with each edge e_ij.
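As a concrete illustration, the four components of an attributed graph map naturally onto simple array structures (the toy graph and attribute values below are invented for illustration, not taken from the project):

```python
import numpy as np

# Toy attributed graph with n = 4 nodes (values invented for illustration).
n = 4
V = list(range(n))                    # node identifiers v_1, ..., v_n
E = [(0, 1), (1, 2), (0, 2), (2, 3)]  # undirected edges

# S: n x n symmetric weight matrix; s_ij = 0 means no edge between v_i, v_j.
S = np.zeros((n, n))
for i, j in E:
    S[i, j] = S[j, i] = 1.0

# Pi: one attribute vector per node (here 3 numeric features each).
Pi = np.array([[1.0, 0.0, 0.5],
               [0.9, 0.1, 0.4],
               [0.0, 1.0, 0.2],
               [0.1, 0.9, 0.3]])
```

A real attributed graph would store S sparsely and derive Π from raw node attributes such as text or images.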

Definition 2. The local structure of an attributed graph corresponds to the first-order proximity, which is the local pairwise proximity between two nodes. For each pair of nodes linked by an edge (v_i, v_j), the weight s_ij on that edge indicates the first-order proximity between v_i and v_j. If no edge is observed between v_i and v_j, then s_ij = 0.

Definition 3. The global structure of an attributed graph corresponds to the second-order proximity. The second-order proximity between a pair of nodes (v_i, v_j) in a graph G represents the similarity between their neighbourhood graph structures. Formally, given a node v_i ∈ V, let N_i = {v_j | e_ij ≠ 0} be the set of neighbours of node v_i. Then, the second-order proximity between v_k and v_l is determined by the similarity between N_k and N_l. If N_k ∩ N_l = ∅, the second-order proximity between v_k and v_l is 0.

Definition 4. The attribute information can be taken into account by considering the attribute proximity, which is the similarity between the attribute representations π_i and π_j of two nodes v_i and v_j.
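Under these definitions, all three proximities can be computed directly from S and Π. A minimal sketch follows; cosine similarity is used here as one possible choice for Definitions 3 and 4, since the project does not commit to a specific similarity measure:

```python
import numpy as np

def first_order(S, i, j):
    # Definition 2: the first-order proximity is the edge weight s_ij itself
    # (0 if no edge is observed between v_i and v_j).
    return S[i, j]

def second_order(S, k, l):
    # Definition 3: similarity between the neighbourhoods N_k and N_l,
    # here measured as cosine similarity between rows k and l of S.
    nk, nl = S[k], S[l]
    if not np.any(nk * nl):  # N_k and N_l share no node -> proximity 0
        return 0.0
    return float(nk @ nl / (np.linalg.norm(nk) * np.linalg.norm(nl)))

def attribute_proximity(Pi, i, j):
    # Definition 4: similarity between the attribute vectors pi_i and pi_j.
    a, b = Pi[i], Pi[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy graph with unit edge weights: edges (0,1), (1,2), (0,2), (2,3).
S = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
# Toy attribute vectors, one per node.
Pi = np.array([[1.0, 0.0],
               [0.8, 0.2],
               [0.1, 0.9],
               [0.0, 1.0]])

print(first_order(S, 0, 1))   # 1.0: v_0 and v_1 are directly linked
print(second_order(S, 0, 1))  # 0.5: both are neighbours of v_2
```

Note that v_0 and v_1 score on both proximities, while a pair with no edge and no shared neighbours would score 0 on both.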

Project

The proposed research project will face the following challenges:

• Structure-preserving: graph embeddings should preserve the structure of the graph, which is often complex and highly non-linear. Moreover, simultaneously preserving the local and global structure is itself a hard problem.

• Scalability: most real-world graphs are huge, containing millions of nodes and edges. The graph representation learning model should be scalable and able to process large graphs.

• Sparsity: many real-world graphs are so sparse that considering only the (few) observed links is not enough to reach satisfactory performance [6].

• Dimensionality of the embedding: the dimension of the embedded representation should be chosen as a trade-off between reconstruction precision and time and space complexity. The choice can also be application-specific, depending on the objective task.

• Attribute expression: the obtained graph embeddings should directly encode the attribute features in addition to the graph structure information.
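The dimensionality trade-off can be made concrete with truncated SVD, used here purely as an illustration rather than as the project's embedding method: the best rank-d reconstruction error of an adjacency matrix decreases as d grows, while storage and computation grow with it.

```python
import numpy as np

# Toy random adjacency matrix (same kind of object as S in Definition 1).
rng = np.random.default_rng(1)
A = (rng.random((20, 20)) < 0.2).astype(float)
A = np.maximum(A, A.T)   # symmetrize
np.fill_diagonal(A, 0)

# Truncated SVD: keep the top-d singular values as a d-dimensional embedding.
U, sigma, Vt = np.linalg.svd(A)
errors = []
for d in (2, 5, 10):
    A_d = (U[:, :d] * sigma[:d]) @ Vt[:d]  # best rank-d reconstruction
    errors.append(np.linalg.norm(A - A_d))

# Reconstruction error shrinks as d grows, at the cost of larger embeddings.
assert errors[0] >= errors[1] >= errors[2]
```

By the Eckart–Young theorem this error is monotonically non-increasing in d, which is exactly the precision-versus-cost curve the embedding dimension must be chosen on.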

Implementation

The research project will be implemented in Python, taking advantage of the Keras library for Deep Learning. Keras is a minimalist, highly modular neural networks library written in Python and capable of running on top of either TensorFlow or Theano.

Since large attributed graphs commonly comprise millions of nodes and related attributes, training Deep Learning models on such data is computationally intensive. We gratefully acknowledge the support of the HPI Future SOC Lab for providing the IT infrastructure and access to NVIDIA Tesla K80 GPUs that make the realization of this project possible.
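The concrete Keras architecture is not fixed at this stage. As a framework-agnostic sketch of the core idea behind autoencoder-based embedding models such as [8] (compress each node's adjacency row into a low-dimensional embedding and reconstruct it), here is a linear autoencoder trained with plain gradient descent in NumPy; the toy graph and hyperparameters are invented for illustration:

```python
import numpy as np

# Toy adjacency matrix: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
S = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
n, d = S.shape[0], 2                 # embed each node into d = 2 dimensions

rng = np.random.default_rng(0)
W_enc = rng.normal(0, 0.1, (n, d))   # encoder weights
W_dec = rng.normal(0, 0.1, (d, n))   # decoder weights
lr = 0.05

for _ in range(2000):
    Z = S @ W_enc                    # embeddings, shape (n, d)
    err = Z @ W_dec - S              # reconstruction error, shape (n, n)
    # Gradient steps on the mean squared reconstruction loss.
    W_dec -= lr * (Z.T @ err) / n
    W_enc -= lr * (S.T @ (err @ W_dec.T)) / n

embeddings = S @ W_enc               # one d-dimensional vector per node
```

Nodes in the same triangle obtain similar embeddings because their adjacency rows are similar; a Keras version would replace the hand-written updates with deeper non-linear encoder/decoder layers and a built-in optimizer.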

Evaluation

The evaluation will be performed on several graph mining tasks, i.e., graph reconstruction, link prediction, node classification, clustering, and visualization. The analysis will be conducted on real datasets originating from different domains, such as social networks, blogs, scientific collaboration networks, and biological interaction networks.
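For instance, link prediction is typically evaluated by hiding a fraction of the edges, ranking candidate node pairs by a score such as the inner product of their embeddings, and checking whether the hidden edges are ranked above non-edges. A minimal sketch with hypothetical, hand-picked embeddings (the inner-product scoring function is an assumption, not fixed by the project):

```python
import numpy as np

def link_scores(embeddings, pairs):
    # Score each candidate pair by the inner product of its node embeddings;
    # higher score = more likely to be a (hidden) edge.
    return [float(embeddings[i] @ embeddings[j]) for i, j in pairs]

# Hypothetical 2-d embeddings for 4 nodes: 0,1 lie close together; so do 2,3.
emb = np.array([[1.0, 0.1],
                [0.9, 0.0],
                [0.0, 1.0],
                [0.1, 0.9]])

candidates = [(0, 1), (0, 2), (2, 3)]
scores = link_scores(emb, candidates)
# Within-cluster pairs (0,1) and (2,3) outscore the cross-cluster pair (0,2).
assert scores[0] > scores[1] and scores[2] > scores[1]
```

Ranking quality over such scores is then summarized with standard metrics such as precision@k or AUC.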

Conclusion and Future Works

The research project aims to propose a novel unsupervised model for attributed graph embeddings. It is expected that considering the relational information in addition to the attribute information will provide significant improvements.

Future work will focus on dealing with attributed graphs with probabilistic relationships. Creating meaningful embeddings of attributed graphs characterized by noisy and uncertain relations represents a major challenge for tackling real-world problems.

References

[1] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” CoRR, vol. abs/1301.3781, 2013.

[2] J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543, 2014.

[3] S. Chang, W. Han, J. Tang, G.-J. Qi, C. C. Aggarwal, and T. S. Huang, “Heterogeneous network embedding via deep architectures,” in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 119–128, 2015.

[4] M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.

[5] P. Goyal and E. Ferrara, “Graph embedding techniques, applications, and performance: A survey,” arXiv preprint arXiv:1705.02801, 2017.

[6] B. Perozzi, R. Al-Rfou, and S. Skiena, “DeepWalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710, 2014.

[7] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “LINE: Large-scale information network embedding,” in Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077, 2015.

[8] D. Wang, P. Cui, and W. Zhu, “Structural deep network embedding,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234, 2016.

http://www.unimib.it http://www.mind.disco.it