“people search, watch, and keep in touch”
DESCRIPTION
Sue Moon, in collaboration with Yong-Yeol Ahn, Meeyoung Cha, Hyunwoo Chun, Seungyeop Han, Haewoon Kwak, Jon Crowcroft, Hawoong Jeong, and Pablo Rodriguez. "People search, watch, and keep in touch" -- Yong-Yeol Ahn. Alexa.com, 2007.6.26.
TRANSCRIPT
Quantitative Analysis of User Behaviors
People search, watch, and keep in touch
Sue Moon
in collaboration with Yong-Yeol Ahn, Meeyoung Cha, Hyunwoo Chun, Seungyeop Han, Haewoon Kwak, Jon Crowcroft, Hawoong Jeong, Pablo Rodriguez

Alexa.com (2007.6.26.)
1 Yahoo.com
2 MSN.com
3 Google.com
4 YouTube.com
5 Live.com
6 MySpace.com
7 Baidu.com
8 Orkut.com
9 Wikipedia.org
10 qq.com
baidu.com: Google-like
qq.com: Naver-like

"People search, watch, and keep in touch"
-- Yong-Yeol Ahn

What did we do before the Internet?

Remember POTS?
POTS = Plain Old Telephone Service

Graham Bell's Illustration

Today's Telephone Network

People only talked

Predictable Behaviors
which translates to

Applicability of same user behavior model over time
which translates to

Easy planning and Management
which translates to

NOW ...

"People search, watch, and keep in touch"
-- Yong-Yeol Ahn

Why should computer scientists care?

Why do I care?

"People search"
They submit queries to search engines
  Queries reflect collective mind
  10 most searched keywords
Blog tags also reflect collective mind
  Infer relations between words from blog tags? [4]

"People watch"
News with still images
  Not watch but browse
VoD (Video On Demand)
UCC (User Created Contents) [2]
IPTV [3]

Implications (I)
[5]

Implications (II)
Network traffic to grow up to sixfold annually (Cisco CTO)
Remember Tech Bubble Burst?

"People stay in touch"
Emails and messages
  Implicit, not explorable
Social networking services
  Explicit, connection visible
  Opportunities for business

From a computer scientist's point of view

"People search"
They submit queries to search engines
  Queries reflect collective mind
  10 most searched keywords
Blog tags also reflect collective mind
  Infer relations between words from blog tags? [4]

"People watch"
News with still images
  Not watch but browse
VoD (Video On Demand)
UCC (User Created Contents) [2]
IPTV [3]

"People stay in touch"
Emails and messages
  Implicit, not explorable
Social networking services
  Explicit, connection visible
  Opportunities for business

I Tube, you tube, everybody tubes

YouTube System
Largest VoD for user-generated contents
Founded in Feb 05
Some daily statistics
- 100M videos served
- 65K videos uploaded
- 60% of online videos served via YouTube
40-50 Gbps bandwidth estimated

Video Example
Owner, upload time, runtime, views, ratings, stars, comments, honors, linking pages

Content producers, consumers

The bulk of files (90%) account for 20% of views
A small set of files (10%) accounts for 80% of views
Pareto Distribution
[Figure annotations: max view = 8.5M; max view = 2.5M; heavy tail (< 1K views)]

Zipf (Power) with exp cutoff
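
The 10%/80% split and the "power law with exponential cutoff" shape can both be written down in a few lines. The following is a minimal sketch, not the analysis used in the talk: the `views` list is made-up illustration data, and the names `top_share`, `power_law_with_cutoff`, `alpha`, and `cutoff` are assumptions introduced here.

```python
import math


def top_share(views, fraction=0.10):
    """Fraction of all views captured by the most-viewed `fraction` of videos."""
    if not views:
        raise ValueError("views must be non-empty")
    ranked = sorted(views, reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


def power_law_with_cutoff(x, alpha, cutoff):
    """Unnormalized power law with exponential cutoff: x**(-alpha) * exp(-x / cutoff).

    `alpha` and `cutoff` are hypothetical parameters; the slides give no values.
    """
    return x ** (-alpha) * math.exp(-x / cutoff)


# Made-up per-video view counts (not from the YouTube trace).
views = [8000, 500, 400, 300, 250, 200, 150, 100, 60, 40]
share = top_share(views, 0.10)
print(f"top 10% of videos hold {share:.0%} of views")
```

With a cutoff, the curve follows the pure power law for small x and drops below it once x approaches `cutoff`, which is the bend a truncated tail shows on a log-log plot.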

Popularity Evolution

Age of daily viewed videos

Watching Television Over Nationwide IP Multicast

Quality-assured IPTV architecture
[Figure: customer premise (home gateway, STB, PC, TV, phone), DSLAM, TV head end, ISP IP backbone, Internet. Link labels: Internet (1 Mb/s), VoIP, IPTV (5 Mb/s, 1-2 channels), last mile (6 Mb/s), 1 Gb/s, 5 Mb/s]

Channel holding time
Spikes in histogram: natural long-term off hours?
Tipping point in CDF
Browse, View, Away

Number of viewers over time
Time-of-day effect
18% increase in viewing over weekends

Channel popularity
Top 10% channels account for 80% viewer share
Zipf-like popularity also shown in PPLive

Static vs Dynamic Multicast Trees
[Figure: source, IP router, DSLAM, STB; static and dynamic trees; annotations cost = 2, cost = 1]
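
One way to read the static/dynamic comparison is as a count of links that carry a channel. The sketch below assumes that cost model, which the slide does not spell out. It also assumes that a static tree feeds every DSLAM at all times, while a dynamic tree keeps only branches that lead to a DSLAM with at least one viewer. The topology and all names (`PARENT`, `static_cost`, `dynamic_cost`) are hypothetical, and the resulting numbers are not meant to reproduce the figure's annotations.

```python
# Hypothetical distribution tree, stored as child -> parent. "source" is the root.
PARENT = {
    "router": "source",
    "dslam_a": "router",
    "dslam_b": "router",
}


def links_to_root(node, parent):
    """Set of (parent, child) links on the path from `node` up to the root."""
    links = set()
    while node in parent:
        links.add((parent[node], node))
        node = parent[node]
    return links


def static_cost(dslams, parent):
    """Static tree (assumed): the channel reaches every DSLAM, watched or not."""
    links = set()
    for dslam in dslams:
        links |= links_to_root(dslam, parent)
    return len(links)


def dynamic_cost(viewers_per_dslam, parent):
    """Dynamic tree (assumed): only branches to DSLAMs with viewers carry the channel."""
    links = set()
    for dslam, viewers in viewers_per_dslam.items():
        if viewers > 0:
            links |= links_to_root(dslam, parent)
    return len(links)


# Made-up audience: three viewers behind one DSLAM, none behind the other.
viewers = {"dslam_a": 3, "dslam_b": 0}
static = static_cost(viewers.keys(), PARENT)
dynamic = dynamic_cost(viewers, PARENT)
print(static, dynamic)
```

Under these assumptions the gap between the two costs grows with the number of DSLAMs that have no viewer for the channel, which fits unpopular channels best.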

Alternate designs for live TV
Server-based IP multicast
Server-less P2P unicast
Server-based IP unicast
co-exist within ISP
How do these technologies compare?

Example routing
[Figure: TV head end, regional server, IP router, DSLAM, STB; CDN, locality-aware P2P, topology-oblivious P2P; annotations cost = 3, cost = 7, cost = 4]

User Clustering
Peep into life-styles of users using NMF
Night Owls 25%
Always-On 50%
Early-birds 25%
[Speaker note: Mention we find three clusters as that was the best heuristic. Besides three, seven was also a good number!]

Channel Correlation
[Figure: channel groups. Documentals: Docu TV, Documania. Nationals: Tele 5, Tve 1, Antena 3, Cuatro, La Sexta, Tve 2. Music2: Trace TV, MTV Base. Movies: MGM, Extreme TV. Music1: 40 Latino, Sol musica, 40 TV.]

Analysis of huge online social networking services

CyWorld

MySpace

Orkut

Online Social Networking Services
Portal for people to
  Stay in touch with friends
  Share photos and personal news
  Find others of common interests
  Establish a forum for discussion

CyWorld
Largest SNS in South Korea
Started in September 2001
10 million users in 2004
16 million users out of 48 million population
Front runner of many features
  Friend (il-chon) relationship
  Guestbook
  Testimonial (il-chon-pyung)
  Photos - scraps
  Avatar in cyber home

My CyWorld Mini-Homepage

CyWorld Data Sets
Complete snapshot (Nov 2005)
  191 million friend relationships between 12 million users
Two additional snapshots (Apr/Sep 2005)

MySpace Data Set
Largest in the world
Began in Jul 2003
Has 130 million by Nov 2006
Snowball sampled
  During Sep/Oct 2006
  Random seed to 100,000 users
  About 23% of users had friend list hidden
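
Snowball sampling can be sketched as a breadth-first crawl that starts from a seed and follows friend lists until a user budget is reached. The code below is an illustrative sketch under that assumption, not the crawler used for these data sets. `get_friends` is a hypothetical callback that returns `None` when a friend list is hidden, and `GRAPH` is made-up toy data.

```python
from collections import deque


def snowball_sample(seed, get_friends, max_users):
    """Breadth-first snowball sample starting from `seed`.

    `get_friends(user)` returns a list of friends, or None if the list is hidden.
    Returns (sampled users, undirected edges among them, number of hidden lists).
    """
    visited = {seed}
    queue = deque([seed])
    edges = set()
    hidden = 0
    while queue:
        user = queue.popleft()
        friends = get_friends(user)
        if friends is None:
            hidden += 1  # the user is in the sample, but cannot be expanded
            continue
        for friend in friends:
            if friend not in visited and len(visited) < max_users:
                visited.add(friend)
                queue.append(friend)
            if friend in visited:
                edges.add(tuple(sorted((user, friend))))
    return visited, edges, hidden


# Toy friendship graph; "c" hides its friend list.
GRAPH = {"a": ["b", "c"], "b": ["a", "d"], "c": None, "d": ["b", "e"], "e": ["d"]}
users, edges, hidden = snowball_sample("a", GRAPH.get, max_users=4)
print(sorted(users), sorted(edges), hidden)
```

Users with hidden lists still appear as neighbors of others, but the crawl cannot expand through them, so their degrees are undercounted in the sample.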

Orkut Data Set
Google SNS
Began in Sep 2002
Became official Google service in Jan 2004
Began as invitation-only; open now
Has 33 million users
Snowball sampled
  During Jun to Sep 2006
  100,000 users

Metrics of Interest
Degree distribution
  Power-law
  Small number of nodes have large numbers of links
Clustering coefficient C(k)
  # of existing links / # of all possible links between a node's adjacent neighbors
  Close to 1, close to a mesh
Degree correlation knn
  Degree k ~ mean degree of adjacent neighbors of nodes with degree k
  Assortativity: characteristic of knn distribution

Assortative Mixing
M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002)
[Figure: degree assortativity, social (+) vs. non-social (-)]

Questions We Raise
What are the main characteristics of online SNSs?
How representative is a sample network?
How does a social network evolve?
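
The clustering coefficient and the degree correlation knn defined under "Metrics of Interest" can be computed directly from an adjacency structure. This is a minimal sketch on a made-up four-node graph; the function names and the `adj` dictionary are assumptions introduced for illustration.

```python
from collections import defaultdict


def clustering_coefficient(adj, node):
    """Existing links among a node's neighbors / all possible links among them."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    existing = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return existing / (k * (k - 1) / 2)


def knn_by_degree(adj):
    """Map each degree k to the mean neighbor degree, averaged over nodes of degree k."""
    per_degree = defaultdict(list)
    for node, nbrs in adj.items():
        if nbrs:
            mean_nbr_degree = sum(len(adj[n]) for n in nbrs) / len(nbrs)
            per_degree[len(nbrs)].append(mean_nbr_degree)
    return {k: sum(vals) / len(vals) for k, vals in per_degree.items()}


# Toy undirected graph: triangle a-b-c, plus d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
c_a = clustering_coefficient(adj, "a")
knn = knn_by_degree(adj)
print(c_a, knn)
```

In this toy graph knn falls as k rises, the pattern usually called disassortative; a rising knn would indicate assortative mixing.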

Historical Analysis

Degree Distribution
Figure 1-(a): degree distribution, CCDF
Two scaling regions
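
A degree CCDF like the one in the figure can be built from a list of node degrees. The sketch below uses the convention P(K >= k), which is an assumption since the slide does not say which convention the figure uses; the `degrees` list is made-up data.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Return sorted (k, P(K >= k)) pairs for the observed degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = []
    remaining = n  # number of nodes with degree >= current k
    for k in sorted(counts):
        ccdf.append((k, remaining / n))
        remaining -= counts[k]
    return ccdf


# Made-up degree sequence.
degrees = [1, 1, 1, 2, 2, 3, 10]
ccdf = degree_ccdf(degrees)
for k, p in ccdf:
    print(k, round(p, 3))
```

Plotted on log-log axes, a single power law appears as one straight line, so two scaling regions would show up as two segments with different slopes.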

Clustering Coefficient Distribution

Degree Correlation
Not assortative

Average Path Length
< 5 is about 90%
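
Path-length figures of this kind come from shortest-path distances between node pairs. The sketch below runs a breadth-first search from every node of a made-up toy graph and reports the average path length and the fraction of pairs below a hop threshold. Reading the slide as "about 90% of pairs are fewer than 5 hops apart" is an assumption, and the function names are hypothetical.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_histogram(adj):
    """Count reachable ordered node pairs by shortest-path length."""
    hist = Counter()
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                hist[d] += 1
    return hist


def average_path_length(hist):
    return sum(d * c for d, c in hist.items()) / sum(hist.values())


def fraction_below(hist, hops):
    """Fraction of reachable pairs whose shortest path is shorter than `hops`."""
    return sum(c for d, c in hist.items() if d < hops) / sum(hist.values())


# Toy path graph a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hist = path_length_histogram(adj)
avg = average_path_length(hist)
below3 = fraction_below(hist, 3)
print(avg, below3)
```

Running a search from every node is only practical on small graphs; on a network with millions of users one would more likely sample source nodes.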

Evolution of Degree Distributions
Two kinds of driving force

Evolution of Path Length
Start of densification?

How about MySpace and Orkut?

Degree Distributions

What did we learn?
CyWorld is saturated
but continues to grow
MySpace fast growing thru cyber-only relationships

Points to ponder
Ease of data collection
Complete data rather than sampled set
Am I asking all the questions?
Or are there many more?

"People search, watch, and keep in touch"
-- Yong-Yeol Ahn

Alexa.com (2007.6.26.)
1 Yahoo.com
2 MSN.com
3 Google.com
4 YouTube.com
5 Live.com
6 MySpace.com
7 Baidu.com
8 Orkut.com
9 Wikipedia.org
10 qq.com
baidu.com: Google-like
qq.com: Naver-like

Web N.0: What sciences will it take?
-- Prabhakar Raghavan

Where do I go from here?

References
[1] Ahn et al., "Analysis of Topological Characteristics of Huge Online Social Networks," WWW 2007.
[2] Cha et al., "I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system," ACM SIGCOMM IMC 2007 (best paper award).
[3] Cha et al., "Watching television over nationwide IP multicast," under submission.
[4] Kwak et al., "Constructing word relationships from tags," in preparation.
[5] Willinger et al., "Scaling phenomena in the Internet: Critically examining criticality," PNAS, vol. 99, suppl. 1.