Discovering the Most Potential Stars in Social Networks Zhuo Peng, Chaokun Wang, Lu Han, Jingchao Hao and Yiyuan Ba Proceedings of the Third International Conference on Emerging Databases, Incheon, Korea (August 2011)

Page 1: Discovering the Most Potential Stars in Social Networks

Discovering the Most Potential Stars in Social Networks

Zhuo Peng, Chaokun Wang, Lu Han, Jingchao Hao and Yiyuan Ba

Proceedings of the Third International Conference on Emerging Databases, Incheon, Korea (August 2011)

Page 2: Discovering the Most Potential Stars in Social Networks

Outline

Introduction
Related Work
Preliminary
Algorithm
Experiments
Conclusion


Page 4: Discovering the Most Potential Stars in Social Networks

Introduction

Purpose: to find the most potential stars in social networks, i.e. the members most worth promoting
How to measure importance
  incoming edges and outgoing edges
  most potential stars = members with the minimum promotion cost
How to find the most potential stars
  Skyline query
  promote a non-skyline member into the skyline by adding new edges that are directly connected to it
  adding a new edge incurs a cost


Page 6: Discovering the Most Potential Stars in Social Networks

Problem Definition

Member promotion in SNs = identifying the most appropriate non-skyline member(s) that can be promoted to skyline member(s) by adding edges at minimum cost
To the best of our knowledge, our paper is the first to raise the member promotion problem in SNs.

Page 7: Discovering the Most Potential Stars in Social Networks

Contributions

First to raise the member promotion problem in SNs and to provide its formal definition
Propose a general promotion algorithmic framework and a brute-force promotion method that solves the problem intuitively
Apply several optimization strategies to improve efficiency, which leads to the IDP algorithm
Conduct extensive experiments on both real and synthetic datasets to show the effectiveness and efficiency of the IDP algorithm

Page 8: Discovering the Most Potential Stars in Social Networks

Outline

Introduction
Related Work
  Skyline Query
  Skyline Minimum Vector
Preliminary
Algorithm
Experiments
Conclusion

Page 9: Discovering the Most Potential Stars in Social Networks

Skyline Query

Retrieves the subset of data points that are not dominated by any other point in a set of D-dimensional data points
Algorithms
  Block Nested Loop (BNL)
  Divide-and-Conquer (D&C)
  Bitmap method
  Nearest Neighbor (NN)
  Branch-and-Bound Skyline (BBS)
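
As a minimal sketch of the skyline definition above, the following Python compares every pair of points. It is not one of the listed algorithms, only a naive quadratic baseline. It assumes that larger values are better in every dimension and that dominance means "at least as good everywhere, strictly better somewhere", which the slide does not spell out. The names dominates and skyline are illustrative.

```python
from typing import List, Tuple

Point = Tuple[int, ...]


def dominates(a: Point, b: Point) -> bool:
    """Assumed dominance: a >= b in every dimension and a > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (naive pairwise check)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical 2-dimensional data set.
points = [(1, 3), (2, 2), (3, 1), (1, 1), (2, 1)]
result = skyline(points)
```

Here (1, 1) and (2, 1) are both dominated by (2, 2), so only the other three points remain in the result.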

Page 10: Discovering the Most Potential Stars in Social Networks

Skyline Minimum Vector

Studies the query for the points that can be changed into skyline points at minimum cost
The cost is measured by the L1 distance of a skyline vector, which starts at the point's original position and points to a skyline position. The skyline minimum vector is thus the one with the minimum L1 distance.
The non-skyline points that can be changed into skyline points by the skyline minimum vectors are the solutions to the problem.
Drawbacks
  the virtual points needed to compute the skyline vectors must be provided in advance
  the skip region used for optimization is not good enough
  no theoretical analysis, such as time complexity analysis or a correctness proof, has been provided


Page 12: Discovering the Most Potential Stars in Social Networks

Preliminary

An SN is modeled as a directed graph G(V, E, W)
  V = the members in the SN
  E = the existing directed edges
  W : V × V → R+, where each w ∈ W denotes the cost of establishing an edge between two different members


Page 14: Discovering the Most Potential Stars in Social Networks

Authoritativeness and Hubness

Authoritativeness
  Given a node v in an SN G(V, E, W), the authoritativeness of v is the indegree of v, denoted din(v)
  Shows how much attention v can get
Hubness
  Given a node v in an SN G(V, E, W), the hubness of v is the outdegree of v, denoted dout(v)
  Shows the importance of v as a hub
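
The two measures can be read directly off an edge list. The sketch below is one possible way to compute them in Python. The function name degrees and the toy graph are hypothetical and not taken from the slides.

```python
from collections import Counter


def degrees(nodes, edges):
    """Return {v: (din(v), dout(v))} for a directed edge list of (source, sink) pairs."""
    din = Counter(sink for _, sink in edges)
    dout = Counter(source for source, _ in edges)
    return {v: (din[v], dout[v]) for v in nodes}


# Hypothetical four-member network.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "b"), ("d", "b"), ("b", "a"), ("d", "a")]
measures = degrees(nodes, edges)
```

In this toy graph, b has the highest authoritativeness (three incoming edges) and d has the highest hubness (two outgoing edges).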

Page 15: Discovering the Most Potential Stars in Social Networks

Candidate Set and Dominator Set

Candidate Set
  Given an SN G(V, E, W), let the skyline member set be SG. When SG ≠ V, the set V − SG, denoted C*, is the candidate set of G. Each node c ∈ V − SG is a candidate for member promotion.
Dominator Set
  Given a member v in an SN G(V, E, W), the dominator set of v, denoted δ(v), is the set of nodes {n | n dominates v, n ∈ V}.
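
Both sets follow mechanically from the (indegree, outdegree) pairs. The sketch below assumes the usual skyline dominance (at least as large on both measures, strictly larger on one), since the transcript does not preserve the authors' exact definition. The function names and the sample measures are hypothetical.

```python
def dominates(a, b):
    """Assumed dominance: a >= b on both measures and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def dominator_set(measures, v):
    """delta(v): all members whose measures dominate those of v."""
    return {n for n, m in measures.items() if dominates(m, measures[v])}


def skyline_members(measures):
    """S_G: members with an empty dominator set."""
    return {v for v in measures if not dominator_set(measures, v)}


def candidate_set(measures):
    """C* = V - S_G."""
    return set(measures) - skyline_members(measures)


# Hypothetical (indegree, outdegree) pairs for four members.
measures = {"a": (2, 1), "b": (3, 1), "c": (0, 1), "d": (0, 2)}
```

With these values, b and d are skyline members, while a and c are candidates for promotion.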


Page 17: Discovering the Most Potential Stars in Social Networks

Promotion

Given an SN G(V, E, W) and a candidate c ∈ C*, a promotion plan against c, denoted p ⊆ V × V, is an edge combination that satisfies:
  (1) p ⊆ {e | (e = (c, ·) ∨ e = (·, c)) ∧ e ≠ (c, c) ∧ e ∉ E},
  (2) c ∈ SG′, where G′ = (V, E + p, W).
More generally, an edge combination that meets only (1) is called a plan.
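
The two conditions can be checked directly. The sketch below reads condition (2) as "c is a skyline member of G′" and recomputes the whole skyline, which is simple but not efficient. Dominance is assumed to be the usual skyline dominance, and all function names and the toy graph are hypothetical.

```python
def dominates(a, b):
    """Assumed dominance: a >= b on both measures and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(nodes, edges):
    """Members whose (indegree, outdegree) pair is dominated by no other member."""
    m = {v: (sum(t == v for _, t in edges), sum(s == v for s, _ in edges))
         for v in nodes}
    return {v for v in nodes if not any(dominates(m[u], m[v]) for u in nodes)}


def is_plan(p, c, edges):
    """Condition (1): every edge touches c, is not a self-loop, and is not in E."""
    return all(c in e and e[0] != e[1] and e not in edges for e in p)


def is_promotion_plan(p, c, nodes, edges):
    """Conditions (1) and (2): p is a plan and c is in the skyline of G' = (V, E + p, W)."""
    return is_plan(p, c, edges) and c in skyline(nodes, set(edges) | set(p))


# Hypothetical four-member network; a and c are its candidates.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"), ("c", "b"), ("d", "b"), ("b", "a"), ("d", "a")}
```

For example, the single edge a→d is a promotion plan against a, whereas c→d is a plan against c but not a promotion plan, because d still dominates c afterwards.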

Page 18: Discovering the Most Potential Stars in Social Networks

Promotion Cost

Given an SN G(V, E, W), the cost of any plan p, denoted γ(p), is the sum of the weights of the edges included in p. Writing the weight of an edge e as ϵ(e),
  γ(p) = Σ_{e∈p} ϵ(e) = Σ_{e∈p} W[e.from][e.to]
where e.from and e.to are the source node and the sink node of edge e respectively.
For any c ∈ C*, let Pc be the set of promotion plans against c. The promotion cost of c, denoted ζ(c), is the minimum cost among all these promotion plans:
  ζ(c) = min_{p∈Pc} γ(p)
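
A small worked example of the two formulas, as a sketch: W is stored as a nested dictionary so that W[e.from][e.to] is a direct lookup, and the set of promotion plans is simply passed in as a list. The weights and the plan list are made up for illustration.

```python
def plan_cost(plan, W):
    """gamma(p): sum of W[e.from][e.to] over the edges e in the plan."""
    return sum(W[e_from][e_to] for e_from, e_to in plan)


def promotion_cost(promotion_plans, W):
    """zeta(c): minimum gamma(p) over the given promotion plans against c."""
    return min(plan_cost(p, W) for p in promotion_plans)


# Hypothetical edge-establishment costs.
W = {"a": {"c": 4, "d": 2}, "c": {"a": 6}}

# Hypothetical promotion plans against member a.
plans_for_a = [[("a", "c")], [("a", "d")], [("a", "d"), ("c", "a")]]
costs = [plan_cost(p, W) for p in plans_for_a]
zeta_a = promotion_cost(plans_for_a, W)
```

The three plans cost 4, 2 and 2 + 6 = 8, so the promotion cost of a is 2.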

Page 19: Discovering the Most Potential Stars in Social Networks

Problem Statement

Member Promotion in SNs
  Given an SN G(V, E, W), member promotion in SNs is to find the member set R that satisfies:
  (1) R ⊆ C*,
  (2) R = {r | r = argmin_{c∈C*} ζ(c)}


Page 21: Discovering the Most Potential Stars in Social Networks

Algorithm

A general framework for promotion algorithms
A brute-force method
An index-based dynamic pruning method

Page 22: Discovering the Most Potential Stars in Social Networks

General Framework

1. Offline calculation of the distribution of both measures of all the members
2. Determine the candidate set by skyline query
3. Against each candidate, perform promotions by adding the edges in the promotion plans, and update the minimum promotion cost if necessary
4. Return the optimal candidate and the related optimal promotion plans


Page 24: Discovering the Most Potential Stars in Social Networks

Brute-Force Algorithm

Verifies all the possible plans with i edges against all the candidates before the best candidate is located
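
The sketch below is one way an exhaustive search of this kind could look; it is not the authors' pseudocode, which the transcript does not preserve. It assumes the usual skyline dominance, tries every plan size i from 1 up to the number of addable edges, and recomputes the full skyline for each plan. The function names, the toy graph, and its weights are hypothetical.

```python
from itertools import combinations


def dominates(a, b):
    """Assumed dominance: a >= b on both measures and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(nodes, edges):
    """Members whose (indegree, outdegree) pair is dominated by no other member."""
    m = {v: (sum(t == v for _, t in edges), sum(s == v for s, _ in edges))
         for v in nodes}
    return {v for v in nodes if not any(dominates(m[u], m[v]) for u in nodes)}


def promotion_cost(c, nodes, edges, W):
    """Return (zeta(c), one cheapest promotion plan) by trying every plan against c."""
    addable = [(s, t) for s in nodes for t in nodes
               if s != t and c in (s, t) and (s, t) not in edges]
    best = (float("inf"), None)
    for i in range(1, len(addable) + 1):
        for plan in combinations(addable, i):
            cost = sum(W[s][t] for s, t in plan)
            # Only plans cheaper than the best so far need the skyline check.
            if cost < best[0] and c in skyline(nodes, edges | set(plan)):
                best = (cost, plan)
    return best


def brute_force(nodes, edges, W):
    """Return {member: (cost, plan)} for the candidates with minimum promotion cost."""
    edges = set(edges)
    candidates = set(nodes) - skyline(nodes, edges)
    zeta = {c: promotion_cost(c, nodes, edges, W) for c in candidates}
    lowest = min((cost for cost, _ in zeta.values()), default=float("inf"))
    return {c: z for c, z in zeta.items() if z[0] == lowest}


# Hypothetical network and edge-establishment costs for the missing edges.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "b"), ("d", "b"), ("b", "a"), ("d", "a")]
W = {"a": {"c": 4, "d": 2}, "b": {"c": 3, "d": 5},
     "c": {"a": 6, "d": 1}, "d": {"c": 2}}
result = brute_force(nodes, edges, W)
```

On this toy graph the candidates are a and c. Member a can be promoted by the single edge a→d at cost 2, while the cheapest promotion of c costs 6, so the search returns a. The number of plans grows exponentially with the number of addable edges, which is the inefficiency that pruning is meant to address.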


Page 26: Discovering the Most Potential Stars in Social Networks

IDP: The Index-based Dynamic Pruning Algorithm

A number of “meaningless” promotions decrease efficiency, so we need a way to recognize the skippable plans for pruning
The pruning relies on several related theorems and lemmas

Page 27: Discovering the Most Potential Stars in Social Networks

IDP: Theorem

Given an SN G(V, E, W), if adding an edge e connecting node vi and the candidate node c still cannot promote c into the skyline set, then no attempt to add an edge e′ connecting a node vj and c in the same direction as e can promote c, where vj ∈ δ(vi)

Page 28: Discovering the Most Potential Stars in Social Networks

IDP: Lemma

Assume a plan p consisting of n edges e1, e2, …, en cannot get its target candidate c promoted. For each edge ei in p connecting vi and c, let li be the list of all the non-existing edges that link a member of δ(vi) and c in the same direction as ei (i = 1, 2, …, n). All the plans with n edges that belong to the Cartesian product l1 × l2 × … × ln can be skipped in the subsequent verification process against c
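
To illustrate the lemma, the sketch below builds the lists li for a failed plan and enumerates the skippable plans with itertools.product. It only generates the skip set; how IDP indexes and looks up these plans is not shown in the slides and is not modeled here. The dominator sets are passed in precomputed, and every name and value is hypothetical.

```python
from itertools import product


def skippable_plans(failed_plan, c, dominators, edges):
    """Plans that can be skipped once failed_plan is known not to promote c.

    failed_plan: list of (source, sink) edges, each touching c.
    dominators:  {v: delta(v)} computed on the original network.
    edges:       set of existing edges E.
    """
    lists = []
    for source, sink in failed_plan:
        outgoing = source == c          # direction of e_i relative to c
        vi = sink if outgoing else source
        li = []
        for vj in sorted(dominators[vi]):
            if vj == c:
                continue                # (c, c) is never a valid edge
            e = (c, vj) if outgoing else (vj, c)
            if e not in edges:          # only non-existing edges qualify
                li.append(e)
        lists.append(li)
    n = len(failed_plan)
    # Keep only combinations that really contain n distinct edges.
    return {frozenset(combo) for combo in product(*lists) if len(set(combo)) == n}


# Hypothetical network: a and c are candidates, b and d are skyline members.
edges = {("a", "b"), ("c", "b"), ("d", "b"), ("b", "a"), ("d", "a")}
dominators = {"a": {"b"}, "b": set(), "c": {"a", "b", "d"}, "d": set()}

skip_one = skippable_plans([("a", "c")], "c", dominators, edges)
skip_two = skippable_plans([("a", "c"), ("c", "d")], "c", dominators, edges)
```

In this example, once a→c fails to promote c, the plan consisting of b→c can be skipped because b dominates a. The second call returns an empty set, since d has no dominators and its list is therefore empty.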

Page 29: Discovering the Most Potential Stars in Social Networks

Final Verification

The skyline may change after a plan is applied, so the candidate may still be dominated by other members
The brute-force algorithm recalculates the skyline set on the whole updated network
Theorem
  Given a plan p, let M be the set of members relevant to the edges in p except the candidate c. If a member v neither dominates c before the promotion nor belongs to M, v will still not dominate c after p is conducted.

Page 30: Discovering the Most Potential Stars in Social Networks

Final Verification

To make sure c is successfully promoted, we only need to rule out that any member is a dominator of the candidate c
Two cases
  the members connected to any edge in the plan may become new dominators of c, because at least one of their two measures increases after the promotion
  the members in the skyline member set may still dominate c
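
The two cases suggest a local check that looks only at the members touched by the plan and at the previous skyline members, instead of recomputing the skyline of the whole network. The sketch below is one reading of that idea under the usual skyline dominance, not the authors' exact procedure. The function name and the sample data are hypothetical.

```python
def dominates(a, b):
    """Assumed dominance: a >= b on both measures and a > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def promoted_locally(c, plan, measures, skyline_before):
    """True if no touched member and no previous skyline member dominates c after the plan.

    measures:       {v: (indegree, outdegree)} before the plan is applied.
    skyline_before: skyline member set of the original network.
    """
    after = {v: list(m) for v, m in measures.items()}
    for source, sink in plan:
        after[source][1] += 1           # one more outgoing edge
        after[sink][0] += 1             # one more incoming edge
    touched = {v for e in plan for v in e}
    suspects = (touched | set(skyline_before)) - {c}
    return not any(dominates(after[v], after[c]) for v in suspects)


# Hypothetical measures; b and d form the skyline, a and c are candidates.
measures = {"a": (2, 1), "b": (3, 1), "c": (0, 1), "d": (0, 2)}
skyline_before = {"b", "d"}
```

For instance, the plan a→d promotes a, while c→d does not promote c because d, which gains an incoming edge, still dominates c.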



Page 33: Discovering the Most Potential Stars in Social Networks

Experimental Settings

Implemented in Java with JDK version 1.6.0_10, on an Intel Core2 Duo CPU T7300 2.00GHz with 1G memory and a 120G hard disk, running Windows XP
Datasets
  USAir
    Includes 332 nodes and 2126 edges
  Power-law set
    Generated with the graph data generator gengraph_win

Page 34: Discovering the Most Potential Stars in Social Networks

Comparison on Promotion Cost

We verified the effectiveness by comparing the promotion costs of the IDP algorithm and a random promotion algorithm

Page 35: Discovering the Most Potential Stars in Social Networks

Comparison on Time Cost

Compares the time cost of the brute-force algorithm and the IDP algorithm on both the USAir and the Power-law datasets


Page 37: Discovering the Most Potential Stars in Social Networks

Conclusion

Raised a new and interesting problem, namely member promotion in social networks
Proposed two algorithms
  the brute-force algorithm
  the IDP algorithm
Future work
  Further improve the algorithm
  Allow several members to be promoted concurrently