Weighted Flow graphs for statistics Edwin de Jonge NTTS February 2009

Page 1

Weighted Flow graphs for statistics

Edwin de Jonge

NTTS February 2009

Page 2

Statistics and flows

• Many official statistics are flow data

– Demography

– Migration

– International trade

But also balance systems:

– System of National Accounts (SNA)

– Energy balance

Page 3

Statistics and visualisation

• Visualisation exploits the visual system to:

– Reveal and highlight patterns in data (trends, correlation, distribution)

• Most common visualisations

– Line and bar charts

– Scatter and bubble plots

– Cartographic choropleth

Page 4

Flow visualisation

• Many official statistics are flow data

– But not presented as flows!

• A flow diagram is a weighted directed graph

– G = (V,E,w)

– Not much visualisation research on weighted directed graphs
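To make the G = (V,E,w) notation concrete, here is a minimal sketch of a weighted directed graph for flow data. The class name `FlowGraph`, its methods and the node labels "A", "B", "C" are illustrative assumptions, not part of the presentation.

```python
class FlowGraph:
    """Weighted directed graph G = (V, E, w), stored as a dict of edge weights."""

    def __init__(self):
        self.nodes = set()   # V: the regions
        self.weights = {}    # w: (source, target) -> weight; the keys form E

    def add_flow(self, source, target, weight):
        """Add a directed flow; repeated flows on the same edge are summed."""
        self.nodes.update((source, target))
        key = (source, target)
        self.weights[key] = self.weights.get(key, 0) + weight

    def outflow(self, node):
        """Total weight leaving a node."""
        return sum(w for (s, _), w in self.weights.items() if s == node)

    def inflow(self, node):
        """Total weight arriving at a node."""
        return sum(w for (_, t), w in self.weights.items() if t == node)


# Hypothetical example: three regions with made-up flow sizes.
g = FlowGraph()
g.add_flow("A", "B", 120)
g.add_flow("B", "A", 45)   # counter flow of A -> B
g.add_flow("A", "C", 10)
```

A plain node-and-edge drawing of this structure would show only `nodes` and the keys of `weights`; the values in `weights` are the statistical data that a flow visualisation has to encode.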

Page 5

Flow visualisation (2)

Options:

• Standard node and edge visualisation

– Not a real option: does not encode the weights (= data)

• Sankey diagrams

– Very good for energy statistics etc.!

• Cartographic flows

– Arrows on a cartographic map

Page 6

Cartographic flows

• Flow maps:

– Many are hand made

– Flow routing is hard

– Number of flows is limited to 50

– Most are unidirectional

Computer-generated cartographic flow layout is still scarce

Page 7
Page 8

Experiment: large flow map

• Most statistical datasets are large!

• Experiment to visualise

– Thousands of flows that are bidirectional: every flow may have a counter flow

• It should:

– Give an overview of all flows

– Show the main flows

– Reveal flow patterns

Page 9

Experiment: Internal migration

• Migration between 459 municipalities in the Netherlands

• Migration is a matrix M(i,j), i, j = 1..N

• m_ij = migration from i to j

• Large number of flows, and they are bidirectional
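As a sketch of how such a matrix turns into a list of directed flows with their counter flows: with N = 459 there are N(N-1) = 210,222 possible ordered pairs of municipalities. The function name `matrix_to_flows` and the small 3x3 matrix are made up for illustration and are not the actual migration data.

```python
def matrix_to_flows(m):
    """Return (i, j, m_ij, m_ji) for every ordered pair with a non-zero flow i -> j.

    m is a square matrix (list of lists); m[i][j] is the migration from i to j.
    The fourth element is the counter flow from j back to i.
    """
    n = len(m)
    flows = []
    for i in range(n):
        for j in range(n):
            if i != j and m[i][j] > 0:
                flows.append((i, j, m[i][j], m[j][i]))
    return flows


# Made-up 3x3 migration matrix (indices are 0-based here).
M = [
    [0, 12, 0],
    [5, 0, 3],
    [0, 40, 0],
]
flows = matrix_to_flows(M)
```

Each bidirectional pair appears twice in the result, once per direction, which is what makes the number of arrows to draw grow so quickly.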

Page 10

Experiment: Internal migration

• Data summary:

– 60,000 movements (of the 210,000)

– Mean = 10, Max = 2880, Median = 2

= Skewed!

• Technology:

– Google Earth, KML file

– Generate arrows as polygons in KML
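The following is a minimal sketch of how one arrow could be written as a KML polygon. It is not the author's implementation: the function names, the arrow shape (a shaft with a triangular head), the planar treatment of longitude/latitude and the example coordinates are all assumptions. Only standard KML elements are used (`Placemark`, `Polygon`, `LinearRing`, `coordinates`, `PolyStyle`).

```python
import math
from xml.sax.saxutils import escape


def arrow_ring(lon1, lat1, lon2, lat2, width, altitude=0.0):
    """Closed outline of an arrow from point 1 to point 2.

    Assumption: longitude/latitude are treated as planar coordinates and
    width is given in degrees, which is only reasonable for a small area.
    """
    dx, dy = lon2 - lon1, lat2 - lat1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("start and end point coincide")
    ux, uy = dx / length, dy / length      # unit vector along the arrow
    nx, ny = -uy, ux                       # unit normal, pointing left
    head = min(2.0 * width, 0.5 * length)  # length of the arrow head
    neck_x, neck_y = lon2 - ux * head, lat2 - uy * head
    half = width / 2.0
    # Points listed counter-clockwise: out along the right side, back along the left.
    pts = [
        (lon1 - nx * half, lat1 - ny * half),
        (neck_x - nx * half, neck_y - ny * half),
        (neck_x - nx * width, neck_y - ny * width),
        (lon2, lat2),                      # tip of the arrow
        (neck_x + nx * width, neck_y + ny * width),
        (neck_x + nx * half, neck_y + ny * half),
        (lon1 + nx * half, lat1 + ny * half),
    ]
    pts.append(pts[0])                     # a LinearRing must be closed
    return [(x, y, altitude) for x, y in pts]


def arrow_placemark(name, ring, kml_color="7f0000ff"):
    """Wrap a ring in a KML Placemark; KML colours are written as aabbggrr."""
    coords = " ".join(f"{x:.6f},{y:.6f},{z:.1f}" for x, y, z in ring)
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<Style><PolyStyle><color>{kml_color}</color></PolyStyle></Style>"
        "<Polygon><altitudeMode>relativeToGround</altitudeMode>"
        f"<outerBoundaryIs><LinearRing><coordinates>{coords}</coordinates>"
        "</LinearRing></outerBoundaryIs></Polygon>"
        "</Placemark>"
    )


def kml_document(placemarks):
    """Combine placemarks into one KML document string."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(placemarks)
        + "</Document></kml>"
    )


# Illustrative arrow between two approximate city locations (lon, lat).
ring = arrow_ring(4.90, 52.37, 4.48, 51.92, width=0.02, altitude=1000.0)
doc = kml_document([arrow_placemark("example flow", ring)])
```

Writing `doc` to a file with a `.kml` extension gives something Google Earth can open; one placemark per flow is the simplest layout, at the cost of a large file for tens of thousands of flows.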

Page 11

Naïve implementation

• Too many arrows

• Visual clutter:

– no overview

– no main flows

– no flow patterns

Page 12

Naïve implementation (2)

Page 13

Visual encoding

• Use visual encoding to reduce clutter

– Arrow

– Width: logarithmic scale; encodes size of flows

– Transparency: logarithmic scale; reduces visual clutter

– Height: linear scale; focus on main flows
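The three encodings can be sketched as one mapping from flow size to arrow width, opacity and height. The slides give only the type of scale (logarithmic or linear); the function names, the use of `log1p`, and every numeric range below are assumptions chosen for illustration.

```python
import math


def encode_flow(weight, max_weight,
                min_width=0.001, max_width=0.02,   # assumed widths, in degrees
                min_alpha=0.1, max_alpha=0.9,      # assumed opacity range
                max_height=5000.0):                # assumed height of largest flow, in metres
    """Map a flow size to (width, alpha, height).

    Width and opacity use a logarithmic scale, height a linear scale.
    """
    if not 0 < weight <= max_weight:
        raise ValueError("weight must be in (0, max_weight]")
    t = math.log1p(weight) / math.log1p(max_weight)   # log position in (0, 1]
    width = min_width + t * (max_width - min_width)
    alpha = min_alpha + t * (max_alpha - min_alpha)
    height = max_height * weight / max_weight         # linear in the flow size
    return width, alpha, height


def kml_color(alpha, red=255, green=0, blue=0):
    """KML colour string in aabbggrr order for an opacity between 0 and 1."""
    a = round(alpha * 255)
    return f"{a:02x}{blue:02x}{green:02x}{red:02x}"


# Using the summary figures from the data: median flow 2, largest flow 2880.
median_style = encode_flow(2, 2880)
largest_style = encode_flow(2880, 2880)
```

With these assumed ranges a median flow of 2 gets a height of about 3.5 m while the largest flow rises to 5000 m, so the linear height scale lifts the main flows above the mass of small ones, whereas the logarithmic width and opacity keep small flows faintly visible.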

Page 14

User interaction / Results

• Use user interaction to filter data

– User can select regions (but not flows)

Results

• Clear overview of overall flows

• Main flows are visible

• Non-local flows are also visible

• But no other patterns!

Page 15
Page 16

Discussion

• Result is OK, but should be further improved

• Better user interaction

– Google Earth (GE) user interaction is very limited

– Select and filter for flows

• Reveal patterns in flow data

– Use cluster techniques to group flows

– Use cluster techniques to group regions