Social Network Analysis (1) LING 575 Fei Xia 01/04/2011


Social Network Analysis (1)

LING 575

Fei Xia

01/04/2011

Basic idea

• Build a graph
– A node represents a person
– A link represents the relation between two persons
– Question: define what kind of relation should be used

• Process the graph to answer questions such as
– what is the structure of the graph
– who is a key player in the graph

• Let’s start with paper #4 (Diesner and Carley, 2005), “Exploration of Communication Networks from the Enron Email Corpus”

(Diesner and Carley, 2005)

• Research questions:
– What are the structure and properties of the communication networks in Enron? How do these features relate to other networks?
– Who are key players or critical individuals in the system?
– How do structure and key players change over time?

Dataset

• Start with the ISI database
– 252,759 emails from 151 people

• Database refinement
– Add job position and job location info
  • there are 15 unique job titles (CEO, president, VP, etc.)
– Normalize email addresses
  • on average, each person has 1.9 email addresses

Communication network

[Figure: communication network in Oct 2000 (160 agents) and in Oct 2001 (174 agents)]

Degree centrality

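• Given a graph G=(V,E) with n vertices, the standard normalized definitions are:

• in-degree centrality: $C_{\text{in}}(v) = \frac{\deg_{\text{in}}(v)}{n-1}$

• out-degree centrality: $C_{\text{out}}(v) = \frac{\deg_{\text{out}}(v)}{n-1}$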
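As a rough illustration of degree centrality on an email graph, the sketch below counts distinct senders and recipients per person and divides by n-1. The normalization by n-1, the function name `degree_centrality`, and the toy `emails` list are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return (in_centrality, out_centrality) for a directed graph.

    `edges` is an iterable of (sender, recipient) pairs. Repeated pairs
    count once and self-loops are ignored (an assumption of this sketch).
    """
    nodes = set()
    successors = defaultdict(set)    # people each node has written to
    predecessors = defaultdict(set)  # people who have written to each node
    for src, dst in edges:
        nodes.update((src, dst))
        if src != dst:
            successors[src].add(dst)
            predecessors[dst].add(src)
    n = len(nodes)
    scale = n - 1 if n > 1 else 1    # avoid division by zero on a 1-node graph
    in_c = {v: len(predecessors[v]) / scale for v in nodes}
    out_c = {v: len(successors[v]) / scale for v in nodes}
    return in_c, out_c

# Hypothetical toy data: (sender, recipient) pairs.
emails = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "ann")]
in_c, out_c = degree_centrality(emails)
```

Here `cat` receives mail from both other people and `ann` writes to both, so each reaches the maximum score on the corresponding measure.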

Closeness centrality

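• Loosely, closeness is the inverse of the average distance in the network between the node and all other nodes.

• If every node is reachable from v: $C_C(v) = \frac{n-1}{\sum_{t \in V \setminus \{v\}} d(v,t)}$, where d(v,t) is the shortest-path distance from v to t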
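The loose definition (inverse of the average distance from v to every other node) can be sketched with a breadth-first search on an unweighted graph. This is a minimal sketch, assuming every node appears as a key of the adjacency dict; the name `closeness_centrality` and the choice to return `None` when some node is unreachable are illustrative assumptions.

```python
from collections import deque

def closeness_centrality(adj, v):
    """Closeness of v: (n - 1) / (sum of shortest-path distances from v).

    `adj` maps each node to an iterable of its neighbours.
    Returns None if some node cannot be reached from v.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:                      # plain BFS, unit edge lengths
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    n = len(adj)
    if n < 2 or len(dist) < n:
        return None                   # undefined under this definition
    return (n - 1) / sum(dist.values())

# Hypothetical undirected path graph a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
middle = closeness_centrality(path, "b")
end = closeness_centrality(path, "a")
```

The middle node is at distance 1 from both others, so it scores higher than either endpoint.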

Betweenness centrality

• Loosely, across all node pairs, the percentage that has a shortest path that passes through v.

• sum = 0;
• For each pair of vertices (s,t), with s ≠ v and t ≠ v:
    – compute all the shortest paths between s and t
    – determine the fraction of shortest paths that go through v
    – sum += fraction;

• betweenness = sum / X; X is (n-1)(n-2)/2 for an undirected graph, and (n-1)(n-2) for a directed graph
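The loop above can be sketched directly in Python. This is a brute-force illustration for small unweighted graphs, not an efficient algorithm; it assumes every node is a key of the adjacency dict and that pairs with no connecting path contribute nothing. The helper names `bfs_counts` and `betweenness` are made up for this example.

```python
from collections import deque
from itertools import combinations, permutations

def bfs_counts(adj, s):
    """Return (dist, sigma): BFS distance from s and the number of
    shortest paths from s to every reachable node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj, v, directed=False):
    """Normalized betweenness of v, following the pairwise loop."""
    nodes = list(adj)
    n = len(nodes)
    if n < 3:
        return 0.0
    info = {s: bfs_counts(adj, s) for s in nodes}
    dist_v, sigma_v = info[v]
    others = [u for u in nodes if u != v]
    pairs = permutations(others, 2) if directed else combinations(others, 2)
    total = 0.0
    for s, t in pairs:
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue                     # no path from s to t
        # v is on a shortest s-t path iff d(s,v) + d(v,t) == d(s,t)
        if v in dist_s and t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    x = (n - 1) * (n - 2) / (1 if directed else 2)
    return total / x

# Hypothetical undirected graphs: a path a - b - c and a 4-cycle.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

On the path, `b` sits on the only shortest path between `a` and `c`. On the 4-cycle, each node carries half of the shortest paths between its two neighbours, which is one pair out of three.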

Key players per centrality measures


Email exchange per month

Emails sent to positions