Milena Mihail, Web Science Tea, Feb 29, 08. Discussion topic:
TRANSCRIPT
What is Web Science ?
Includes some intersection of comp sci, economics, social sci.
Our grassroots discussions :
Microsoft: New Cambridge Lab, Jennifer Chayes
Yahoo: Raghavan WWW06, Brachman GT talk
Chris Klaus GT talk
NSF : CDI
Elsewhere :
Our non-grassroots discussions: Super-Duper Data Center, à la Jeannette Wing.
Should revisit this point, in view of NSF-Google-IBM?
What is Web Science ?
The study of the WWW, broadly defined.
By virtue of the pervasiveness of the object of study.
Systems-like science (like chemistry or biology).
As opposed to “computer science”, which is the study of “computation”, biology is the study of “life”, from the cell to evolution to animals…
Should be studied in terms of its descriptive / predictive / explanatory / prescriptive analytic value.
Parenthesis: MSN SemGrail 07
Why should there be Web Science ?
Encourage collaboration across different areas.
Something between the union and intersection of several areas.
Need to establish common vocabulary, goals, problems.
“Understanding the elephant versus the tail or trunk”.
Educate students for industry.
Encourage academia to understand the study of the Web as a discipline.
Themes cutting across subareas of Web science
Long Tails / Economics / Culture
Fractal Nature, multi-scale
Humans and machines interact, and the interactions are registered.
New dimension in social sciences.
Transformed the way we think about information (analogy to the introduction of the printing press).
Democracy of information: producers and consumers of information coincide.
Dynamics, emergent systems, social networks.
Requires new analytics (e.g., what are the right logics, probabilistic and approximation metrics).
What is Web Science ?
Includes some intersection of comp sci, economics, social sci.
Our grassroots discussions :
(in this spirit)
Outline:
Wide Range of Models
Canonical Example: Modeling Small World Phenomenon
Model Parameters/Metrics and their Relevance
Models: Structural, Explanatory (Optimization or Incentive Driven), Hybrid
Which question are you (am I) trying to answer?
Range of Models
Internet (general) Routing Internet
AS Level
Routing Level
(nice pictures with some meaning)
few long links in a flat world
Sparse Power Law Graphs with very different assortativity
Range of Models
Patent / co-author network in Boston area
(nice pictures with some meaning)
Flickr social network from Flickr search keyword “graph”
notice bottleneck bad cut
notice no bottleneck bad cut
( Range of Flickr Pictures - meaning ? )
Technology Platforms
Local Facebook Friendship Graph
A Web Page Organization
4 Color Theorem
Range of Mathematical Models
Rick Durrett, Cornell, Probabilist
Matthew Jackson, Stanford, Economist
Canonical Example: Modeling the Small World Phenomenon
Milgram’s Experiment, 60’s: Even though relationships are highly clustered, most people are pairwise reachable via short paths. “Six Degrees of Separation” (for fun, see also Facebook group).
Watts & Strogatz’s Model, 90’s: In a clustered graph of size n, a few random links decrease the diameter to O(log n).
Clustering and Small Diameter
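The effect of a few random links on diameter can be checked numerically. The sketch below is only an illustration under stated assumptions: the clustered graph is taken to be a ring lattice, and the sizes (400 nodes, 40 shortcuts) and all function names are made up for this example rather than taken from the talk.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Clustered base graph: each node links to its k nearest neighbors on each side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj

def add_random_links(adj, m, rng):
    """Add m random shortcut edges between distinct, not yet adjacent nodes."""
    nodes = list(adj)
    added = 0
    while added < m:
        u, v = rng.sample(nodes, 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1

def eccentricity(adj, source):
    """Largest BFS distance from source (the graph is assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return max(dist.values())

def diameter(adj):
    """Exact diameter via one BFS per node; fine for small graphs only."""
    return max(eccentricity(adj, v) for v in adj)

rng = random.Random(0)
graph = ring_lattice(400, 2)
diameter_before = diameter(graph)
add_random_links(graph, 40, rng)
diameter_after = diameter(graph)
print(diameter_before, diameter_after)
```

Adding edges can never increase the diameter; the point of the experiment is how sharply it drops relative to the number of shortcuts added.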
Kleinberg 90’s: Navigability !These short paths can be found efficiently with local search!
Kleinberg’s navigability model
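Theorem: The only value for which the network is navigable is r = 2, where, on the two-dimensional grid, a node's long-range link goes to a node at lattice distance d with probability proportional to d^(-r).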
Are there natural network models which are navigable and have, eg, power-law degree distributions ?
Are there natural models where the threshold is not sharp ?
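A small simulation can make the navigability setting concrete. The sketch below assumes the usual grid formulation: nodes on a side-by-side grid, four lattice neighbors, and one long-range contact per node drawn with probability proportional to d^(-r). The function names and the grid size are illustrative choices. Long-range contacts are sampled lazily when a node is visited, which is equivalent here because greedy routing never revisits a node.

```python
import random

def lattice_distance(a, b):
    """Manhattan distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(node, side, r, rng):
    """Sample one long-range contact with probability proportional to d**(-r)."""
    others = [(x, y) for x in range(side) for y in range(side) if (x, y) != node]
    weights = [lattice_distance(node, other) ** (-r) for other in others]
    return rng.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, side, r, rng):
    """Local search: always forward to the known contact closest to the target.

    Returns the number of hops taken.
    """
    current, hops = source, 0
    while current != target:
        x, y = current
        contacts = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= x + dx < side and 0 <= y + dy < side]
        contacts.append(long_range_contact(current, side, r, rng))
        current = min(contacts, key=lambda c: lattice_distance(c, target))
        hops += 1
    return hops

rng = random.Random(1)
side = 20
for r in (0, 2, 4):
    trials = [greedy_route((0, 0), (side - 1, side - 1), side, r, rng)
              for _ in range(20)]
    print(r, sum(trials) / len(trials))
```

The theorem is asymptotic, so on a grid this small the gap between r = 2 and other exponents may be modest or hidden by noise; larger grids and more trials would be needed to see the separation clearly.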
Model Parameters/Metrics (as a function of n) and their Relevance
Average degree and Degree distribution
Clustering coefficient (small dense subgraphs)
Diameter
Expansion/Conductance (bottlenecks)
Eigenvalues, eigenvectors (quantify bottlenecks and find groups efficiently)
e.g., in prediction / simulation (economics, engineering)
Evolving toward monopolies/oligopolies?
Can it be searched, crawled efficiently?
Can pagerank be computed efficiently?
Can it route with low congestion?
Does it support efficient info retrieval?
How does information/technology spread?
Important to have FLEXIBLE network models
Assortativity
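Several of the metrics listed above can be computed directly on a small adjacency structure. The following is a minimal sketch, not a reference implementation: the graph is a hypothetical example (two triangles joined by a single edge, which creates a bottleneck), the helper names are invented here, and assortativity is taken to mean the Pearson correlation of degrees across edges.

```python
def average_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def clustering_coefficient(adj):
    """Mean over nodes of the fraction of neighbor pairs that are themselves linked."""
    total = 0.0
    for v in adj:
        nbrs = list(adj[v])
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for i in range(d) for j in range(i + 1, d)
                    if nbrs[j] in adj[nbrs[i]])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)

def conductance(adj, subset):
    """Edges crossing the cut divided by the smaller side's total degree."""
    subset = set(subset)
    cut = sum(1 for v in subset for u in adj[v] if u not in subset)
    vol_in = sum(len(adj[v]) for v in subset)
    vol_out = sum(len(adj[v]) for v in adj if v not in subset)
    return cut / min(vol_in, vol_out)

def degree_assortativity(adj):
    """Pearson correlation of the degrees at the two ends of each edge.

    Undefined for regular graphs; this sketch returns 0.0 in that case.
    """
    pairs = [(len(adj[v]), len(adj[u])) for v in adj for u in adj[v]]
    mean = sum(x for x, _ in pairs) / len(pairs)
    var = sum((x - mean) ** 2 for x, _ in pairs)
    if var == 0:
        return 0.0
    cov = sum((x - mean) * (y - mean) for x, y in pairs)
    return cov / var

# Hypothetical example: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
barbell = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(average_degree(barbell))
print(clustering_coefficient(barbell))
print(conductance(barbell, {0, 1, 2}))
print(degree_assortativity(barbell))
```

The low conductance of the cut separating the two triangles is the numerical signature of the bottleneck, while the high clustering coefficient reflects the dense triangles on either side.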
Structural / Macroscopic Models
Random graphs with desirable graph properties, thought to be aggregating all microscopic primitives.
Example 1: Power Law Random Graph
Given a degree sequence d_1, ..., d_n, choose a random perfect matching over the d_1 + ... + d_n mini-vertices (d_i copies of vertex i).
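One common way to realize a random perfect matching over mini-vertices is to shuffle the list of vertex copies and pair consecutive entries. The sketch below assumes that construction; the degree sequence used in the usage lines is a hypothetical rank-based sequence chosen only to look heavy-tailed, and the result is a multigraph that may contain self-loops and parallel edges.

```python
import random

def configuration_model(degrees, rng):
    """Random perfect matching over mini-vertices; returns a multigraph edge list.

    degrees[v] copies ("mini-vertices") of vertex v are created, shuffled,
    and paired off consecutively.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sum must be even")
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

# Hypothetical heavy-tailed sequence: the degree of the vertex of rank i
# decays like i**(-0.5).
n = 50
degrees = [max(1, int((n / rank) ** 0.5)) for rank in range(1, n + 1)]
if sum(degrees) % 2:
    degrees[-1] += 1  # fix parity so a perfect matching exists
edges = configuration_model(degrees, random.Random(0))
print(len(edges), edges[:5])
```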
Example 2: Growth & Preferential Attachment
One vertex at a time: the new vertex attaches to existing vertices with probability proportional to their degrees.
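The growth process can be sketched in a few lines. This is one possible variant, under assumptions that are not in the slides: the graph starts from a small clique, each new vertex makes m links, and distinct targets are obtained by repeating a degree-proportional draw until m different vertices have been picked. The names are illustrative.

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a graph to n vertices, one vertex at a time.

    Each new vertex links to m distinct existing vertices, drawn by repeated
    sampling with probability proportional to current degree.
    """
    # Seed graph: a clique on m + 1 vertices.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Every vertex appears in this list once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over vertices.
    endpoints = [x for edge in edges for x in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

pa_edges = preferential_attachment(200, 2, random.Random(0))
degree = {}
for u, v in pa_edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(pa_edges), max(degree.values()))
```

Keeping a flat list of edge endpoints avoids recomputing degree weights at every step, at the cost of memory proportional to the number of edges.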
Some evolutionary random graph models may also capture more factors, e.g., geography, and hence varying conductance.
Example 2, generalization towards flexibility:
Explanatory / Microscopic Models / Optimization Driven
Example: HOT, evolutionary, new node attaches by minimizing cost and maximizing quality of service
Point: Optimization primitives can yield power law distributions.
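As an illustration of an optimization primitive, the sketch below assumes one specific tradeoff that the slides do not spell out: nodes arrive at uniformly random points in the unit square, "cost" is Euclidean distance to the chosen neighbor, and "quality of service" is the hop count from that neighbor to the first node. The weight alpha and all names are hypothetical.

```python
import math
import random

def hot_tree(n, alpha, rng):
    """Grow a tree where node i attaches to the earlier node j minimizing
    alpha * distance(i, j) + hops(j), with hops(j) the hop count from j to node 0.

    Returns (parent, hops) lists indexed by node.
    """
    points = [(rng.random(), rng.random())]
    hops = [0]
    parent = [None]
    for i in range(1, n):
        p = (rng.random(), rng.random())
        best = min(range(i),
                   key=lambda j: alpha * math.dist(p, points[j]) + hops[j])
        points.append(p)
        hops.append(hops[best] + 1)
        parent.append(best)
    return parent, hops

parent, hops = hot_tree(300, 4.0, random.Random(0))
in_degree = [0] * len(parent)
for j in parent[1:]:
    in_degree[j] += 1
print(max(in_degree), max(hops))
```

With alpha = 0 every node attaches directly to node 0 (a star), and with very large alpha each node attaches to its nearest predecessor; intermediate values trade the two objectives against each other.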
Explanatory / Microscopic Models / Incentive Driven
Example: A Network Formation Game
How fast can such a stable configuration be reached?
SUMMARY
It is important to identify critical metrics and parameters, i.e., how they impact network performance.
It is important to develop flexible network models in which critical parameters vary.
It is important to identify network primitives related to optimization and incentives.
It is important to develop mechanisms that affect such primitives.