perspective sampling: a visualization march 28, 2017...

21
Evaluation of Graph Sampling: A Visualization Perspective Paper by: Yanhong Wu, Nan Cao, Daniel Archambault, Qiaomu Shen, Huamin Qu, and Weiwei Cui Presentation by: Austin Wallace. March 28, 2017

Upload: others

Post on 13-Mar-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Evaluation of Graph Sampling: A Visualization Perspective

Paper by: Yanhong Wu, Nan Cao, Daniel Archambault, Qiaomu Shen, Huamin Qu, and Weiwei Cui

Presentation by: Austin Wallace.March 28, 2017

Page 2: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

What’s better, B or C?

2

Page 3: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

A little different, right?

● Similar quantitative statistics ● Very different perceptually

3

Page 4: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Problem: Analyzing large graphs

● Large graphs are difficult to analyze even with state of the art techniques on high-end clusters

● Can reach hundreds of millions, or even billions of nodes

4

Page 5: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

One Solution: Graph sampling

● Sampled graph often more desirable than small chunk of original graph

● Makes analysis on large graphs tractable● Can be used for preliminary evaluation

5

Page 6: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

One more problem: How to sample?

What is the best way to sample?

● Should we pick nodes at random?● Traverse the graph?

6

Page 7: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Lots of solutions!

This paper focusses on five of the most widely used:

● Random Node (RN)● Random Edge Node (REN)● Random Walk (RW)● Random Jump (RJ)● Forest Fire (FF)

7

Page 8: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

What? Why? How?

What:

● Node-link unweighted networks (N: ~1000-20000)

Why:

● Summarize topology

How:

● RN, REN, RW, RJ, FF8

Page 9: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Key Question: Perceptual Quality

What are the main factors that affect perceptual quality in a sampled graph?

How are those factors affected by the five sampling strategies?

9

Page 10: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Important Perceptual Qualities

Three identified:

10

Page 11: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Important Perceptual Qualities

Three identified:

● Coverage Area●

11

Page 12: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Important Perceptual Qualities

Three identified:

● Coverage Area● Cluster Quality●

12

Page 13: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Important Perceptual Qualities

Three identified:

● Coverage Area● Cluster Quality● High Degree Nodes, and their preservation

In addition, 20% sampling rate was selected as a fair comparison rate

13

Page 14: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Graphs used: BA and Sah

14Power law networks generated by a Barabasi-Albert model

Guaranteed cluster networks generated by Sah et al.’s model

Page 15: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

How did they fare: Coverage Area

● Best: Random Edge Node and Random Jump○ Do not get trapped, but are not as sparse as Random Node

● Random Walk is poorest○ May not explore anywhere near the whole graph, leaving out entire

sections○ Researchers expected Random Node to be poorest

● Forest Fire and Random Walk do better in less modular graphs

15

Page 16: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

How did they fare: Cluster Quality

16

● Best: Random Edge Node and Random Jump perform best

● Poorest: Random Node and Forest Fire ● Random Walk depends on graph modularity, but not

graph size

Page 17: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

How did they fare: High Degree Nodes

17

● Best: Random Walk ○ Can visit the same node many times

● Poorest: Random node is consistently poor○ Not at all biased towards high degree nodes

● Random jump does well, but may jump away before fully exploring a high degree node

● Random Edge Nodes is biased towards high degree nodes, so does better

Page 18: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

So, which is best?

● Random Walk to preserve high-degree nodes● Random Jump or Random Edge Node to preserve

global structure and cluster quality● Almost never use Random Node

18

Page 19: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Strengths

● Substantial thought given to experiment design and neutralizing potential confounds

● Depth of work: Pilot study, three formal studies● Useful, well explained, and nuanced recommendations

19

Page 20: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Weaknesses and limitations

● Does not explore the laying out of graphs post-sampling.

● Only used computer science students/graduates in their studies

● Single sampling rate was tested

20

Page 21: Perspective Sampling: A Visualization March 28, 2017 ...tmm/courses/547-17/slides/austin-graphsampling.pdf · Presentation by: Austin Wallace. March 28, 2017. What’s better, B or

Potential future work

● Improve metrics based on human feedback● Perceptual quality of graph abstraction, as opposed to

sampling● Investigate time to complete tasks on sampled graphs,

as well as accuracy● Investigate false positives, such as a sampled low

degree perceived as high degree21