An Empirical Study of Flash Crowd Dynamics in a P2P-based Live Video Streaming System
Bo Li, Gabriel Y. Keung, Susu Xie, Fangming Liu, Ye Sun, and Hao Yin
Email: [email protected]
Hong Kong University of Science & Technology
Dec 2, 2008 @ IEEE GLOBECOM, New Orleans
Overview: Internet Video Streaming
Enable video distribution from any place to anywhere in the world, in any format
Cont. Recently, there has been significant deployment of Peer-to-Peer (P2P) technology for Internet live video streaming
Protocol designs: Overcast, CoopNet, SplitStream, Bullet, etc.
Real deployments: ESM, CoolStreaming, PPLive, etc.
Key advantages
• Requires minimal support from the infrastructure
• Greater demand also generates more resources: each peer not only downloads the video content, but also uploads it to other participants
• Easy to deploy
• Good scalability
Challenges
Real-time constraints: requiring timely and sustained streaming delivery to all participating peers
Performance-demanding: bandwidth requirements of hundreds of kilobits per second, and even more for higher-quality video
Large-scale and extreme peer dynamics: tens of thousands of users simultaneously participating in the streaming, with peers joining and leaving at will, especially under flash crowd
Motivation
Flash crowd
A large increase in the number of users joining the streaming in a short period of time (e.g., during the initial few minutes of a live broadcast program)
Difficult to quickly accommodate new peers within a stringent time constraint, without significantly impacting the video streaming quality of existing and newly arrived peers
Different from file sharing
Challenge: Large-scale & extreme peer dynamics
Current P2P live streaming systems still suffer from potentially long startup delays & unstable streaming quality
Especially under realistic, challenging scenarios such as flash crowd
Focus
Cont. Little prior study on the detailed dynamics of P2P live streaming systems during flash crowd and its impacts
E.g., Hei et al.'s measurement on PPLive: the dynamics of the user population during the annual Spring Festival Gala on Chinese New Year
How to capture various effects of flash crowd in P2P live streaming systems?
What are the impacts of flash crowd on user experience & behaviors, and on system scale?
What are the rationales behind them?
Outline
System Architecture
Measurement Methodology
Important Results: Short Sessions under Flash Crowd, User Retry Behavior under Flash Crowd, System Scalability under Flash Crowd
Summary
Some Facts of the CoolStreaming System
CoolStreaming: Cooperative Overlay Streaming
First released in 2004
Roxbeam Inc. received a USD 30M investment; currently delivered through Yahoo BB, the largest video streaming portal in Japan
Downloads: 2,000,000
Average online users: 20,000
Peak-time online users: 150,000
Google entries (keyword: Coolstreaming): 400,000
CoolStreaming System Architecture
(Architecture diagram: Stream Manager, Partner Manager, and Member Manager modules; Buffer Maps (BM) and segments exchanged between peers)
Membership manager: maintaining a partial view of the overlay via gossip
Partnership manager: establishing & maintaining TCP connections (partnerships) with other nodes; exchanging data availability via Buffer Maps (BM)
Stream manager: providing stream data to the local player; deciding where and how to retrieve stream data
Hybrid Push & Pull
Mesh-based (Data-driven) Approaches
No explicit structures are constructed and maintained, e.g., Coolstreaming, PPLive
Data flow is guided by the availability of data: the video stream is divided into segments of uniform length, and the availability of segments in a peer's buffer is represented by a buffer map (BM)
Peers periodically exchange data availability info with a set of partners (a partial view of the overlay) and retrieve currently unavailable data from each other
A segment scheduling algorithm determines which segments are to be fetched from which partners
Overhead & delay: peers need to explore content availability with one another, which is usually achieved with a gossip protocol
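To make the pull-based scheduling concrete, the following is a minimal Python sketch of how a peer might decide which missing segments to request from which partners, given their buffer maps. The playback-order traversal and least-loaded-partner tie-breaking are illustrative assumptions, not CoolStreaming's actual scheduler.

```python
# Minimal sketch of pull-based segment scheduling from partner buffer maps.
# Assumptions for illustration: segments are indexed by sequence number,
# a buffer map is a set of available segment numbers, and we prefer the
# partner with the fewest pending requests (simple load balancing).

def schedule_segments(needed, partner_maps):
    """Return {partner_id: [segment, ...]} for the segments we still need.

    needed        -- iterable of segment numbers missing from our buffer
    partner_maps  -- dict {partner_id: set(available segment numbers)}
    """
    assignments = {pid: [] for pid in partner_maps}
    for seg in sorted(needed):                     # request in playback order
        holders = [pid for pid, bm in partner_maps.items() if seg in bm]
        if not holders:
            continue                               # nobody has it yet; retry next round
        # pick the least-loaded partner among those holding the segment
        best = min(holders, key=lambda pid: len(assignments[pid]))
        assignments[best].append(seg)
    return {pid: segs for pid, segs in assignments.items() if segs}

# Example: we miss segments 10-13; partner A has 10-12, partner B has 12-13.
print(schedule_segments(range(10, 14), {"A": {10, 11, 12}, "B": {12, 13}}))
```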
Measurement Methodology
3 types of status reports:
QoS report: % of video data missing the playback deadline
Traffic report
Partner report
4 events of each session:
Join event
Start subscription event
Media player ready event: the peer has received sufficient data to start playing
Leave event
Each user reports its activities & internal status to the log server periodically
Using HTTP, the peer log is compacted into parameter parts of the URL string
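As an illustration of how a status report could be compacted into URL parameters over HTTP, here is a small sketch; the log-server URL and field names (peer, event, miss, partners) are hypothetical, not the actual CoolStreaming log format.

```python
# Minimal sketch of packing a peer status report into an HTTP GET query string,
# as the slides describe. The endpoint and field names are illustrative
# assumptions, not the actual CoolStreaming log format.
from urllib.parse import urlencode

LOG_SERVER = "http://logserver.example.com/report"   # hypothetical endpoint

def build_report_url(peer_id, event, qos_missed_pct, partners):
    params = {
        "peer": peer_id,
        "event": event,                    # join / subscribe / ready / leave
        "miss": f"{qos_missed_pct:.2f}",   # % of data missing playback deadline
        "partners": ",".join(partners),    # current partner list
    }
    return f"{LOG_SERVER}?{urlencode(params)}"

print(build_report_url("peer-42", "ready", 1.25, ["10.0.0.5:7788", "10.0.0.9:7788"]))
```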
Log & Data Collection
Real-world traces obtained from a live event broadcast on Japan Yahoo using the CoolStreaming system
A sports channel on Sept. 27, 2006 (24 hours)
Live baseball game broadcast at 18:00
Stream bit-rate is 768 Kbps
24 dedicated servers with 100 Mbps connections
How to capture flash crowd effects? Two key measures:
Short session distribution: counts sessions of users that either fail to start viewing a program or whose service is disrupted during flash crowd
Session duration is the time interval between a user joining and leaving the system
User retry behavior: to cope with the service disruption often observed during flash crowd, each peer can re-connect (retry) to the program
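A minimal trace-analysis sketch of the first measure, assuming each session record carries a peer id, join time, and leave time in seconds; the record layout and 60-second bucketing are assumptions for illustration.

```python
# Minimal sketch of extracting "short sessions" from a session trace.
# Thresholds of 120 s / 240 s follow the slides; the record layout is assumed.
from collections import Counter

def short_sessions(records, threshold=120, bucket=60):
    """Count short sessions (duration <= threshold) per time bucket of joins."""
    counts = Counter()
    for peer_id, join_t, leave_t in records:
        duration = leave_t - join_t
        if duration <= threshold:
            counts[int(join_t // bucket)] += 1     # bucket by join minute
    return counts

trace = [("p1", 64800, 64850), ("p2", 64800, 70000), ("p3", 64830, 64900)]
print(short_sessions(trace))   # {1080: 2} -> two short sessions around 18:00
```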
Short Sessions under Flash Crowd
Filter out normal sessions (i.e., users who successfully join the program)
Focus on short sessions with duration <= 120 sec and <= 240 sec
The number of short sessions increases significantly at around 18:00, when the flash crowd occurs with a large number of peers joining the live broadcast program
What are the rationales behind these observations? Relevant factors:
User client connection fault
Insufficient uploading capacity from at least one of the parents
Poor sustainable bandwidth at the beginning of the stream subscription
Long waiting time (timeout) for accumulating sufficient video content at the playback buffer
Newly coming peers do not have adequate content to share with others, thus initially they can only consume the uploading capacity from existing peers
With partial knowledge (gossip), the delay to gather enough upload bandwidth resources among peers and the heavy resource competition could be the fundamental bottleneck
Approximate User Impatient Time
In the face of poor playback continuity, users either reconnect or opt to leave
Compare the total downloaded bytes of a session with the expected total playback video bytes according to the session duration; extract sessions with insufficient downloaded bytes
The avg. user impatient time is between 60s and 120s
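A small sketch of this approximation, assuming the 768 Kbps stream bit-rate from the trace and an illustrative 50% shortfall factor for deciding that a session's download was insufficient.

```python
# Minimal sketch of approximating "impatient time": for sessions whose total
# download falls well short of what smooth playback at the stream bit-rate
# would require, treat the session duration as the time the user waited
# before giving up. The 0.5 shortfall factor is an illustrative assumption.
BITRATE_BPS = 768_000          # 768 Kbps stream, from the slides

def impatient_durations(sessions, shortfall=0.5):
    """sessions: iterable of (duration_sec, downloaded_bytes)."""
    out = []
    for duration, downloaded in sessions:
        expected = BITRATE_BPS / 8 * duration      # bytes needed for full playback
        if downloaded < shortfall * expected:      # clearly insufficient download
            out.append(duration)
    return out

durs = impatient_durations([(90, 1_000_000), (90, 8_000_000), (300, 28_000_000)])
print(durs, sum(durs) / len(durs) if durs else None)   # [90] 90.0
```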
User Retry Behavior under Flash Crowd
Retry rate: count the number of peers that opt to re-join the overlay with the same IP address and port, per unit time
Users could have tried many times to successfully start a video session
Again shows that flash crowd has a significant impact on the initial joining phase
User perspective: playback could be restored
System perspective: retries amplify the join rates
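A minimal sketch of the retry-rate computation under the stated definition, assuming each join event carries a timestamp and the peer's (IP, port) endpoint; the event layout is an assumption.

```python
# Minimal sketch of computing a retry rate from join events: re-joins from the
# same (IP, port) counted per unit time, as defined in the slides.
from collections import Counter

def retry_rate(join_events, bucket=60):
    """join_events: iterable of (timestamp_sec, ip, port). Returns retries per bucket."""
    seen = set()
    retries = Counter()
    for ts, ip, port in sorted(join_events):
        key = (ip, port)
        if key in seen:                      # same endpoint joining again -> a retry
            retries[int(ts // bucket)] += 1
        else:
            seen.add(key)
    return retries

events = [(64800, "1.2.3.4", 7788), (64845, "1.2.3.4", 7788), (64900, "5.6.7.8", 7788)]
print(retry_rate(events))   # {1080: 1}: one retry in the 18:00-18:01 bucket
```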
System Scalability under Flash Crowd
Media player ready: received sufficient data to start playing, i.e., successfully joined
The gap between the join rate and the media player ready rate illustrates the "catch-up process"
The media player ready rate picks up when the flash crowd occurs and increases steadily; however, the ratio between these two rates is <= 0.67
This implies that the system has the capability to accommodate a sudden surge of user arrivals (flash crowd), but only up to some maximum limit
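A small sketch of this scalability measure, assuming a per-bucket comparison of join events against media-player-ready events; the event layout is an assumption for illustration.

```python
# Minimal sketch of the scalability measure: compare the rate of "join" events
# with the rate of "media player ready" events per time bucket.
from collections import Counter

def rate_ratio(events, bucket=60):
    """events: iterable of (timestamp_sec, kind) with kind in {'join', 'ready'}."""
    joins, readies = Counter(), Counter()
    for ts, kind in events:
        (joins if kind == "join" else readies)[int(ts // bucket)] += 1
    return {b: readies[b] / joins[b] for b in joins if joins[b]}

evts = [(64800, "join"), (64805, "join"), (64810, "join"), (64850, "ready"), (64855, "ready")]
print(rate_ratio(evts))   # {1080: 0.666...}: the ready rate lags the join rate
```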
Media Player Ready Time under different time periods
Considerably longer during periods when the peer join rate is higher
Scale-Time Relationship
System perspective:
Though newly coming peers could bring enough aggregate resources, these resources cannot be utilized immediately
It takes time for the system to exploit such resources, i.e., newly coming peers (with a partial view of the overlay) need to find & consume existing resources to obtain adequate content for startup before they can contribute to others
User perspective:
Causes long startup delay & disrupted streaming (thus short sessions, retries, impatience)
Future work: how does system scale relate to long startup delay, poor continuity, and the amount of initial buffering?
Summary
Based on real-world measurements, we capture flash crowd effects
The system can scale up to a limit during the flash crowd
Strong correlation between the number of short sessions and joining rate
The user behavior during flash crowd can be best captured by the number of short sessions, retries and the impatient time
Relevant rationales behind these findings
Future work
Modeling to quantify and analyze flash crowd effects
Correlation among initial system capacity, the user joining rate/startup delay, and system scale?
Intuitively, a larger initial system size can tolerate a higher joining rate
Challenge: how to formulate the factors and performance gaps relevant to partial knowledge (gossip)?
Based on the above study, perhaps more importantly for practical systems, how can servers help alleviate the flash crowd problem, i.e., shorten users’ startup delays, boost system scaling?
Commercial systems have utilized self-deployed servers or CDNs
Coolstreaming (Japan Yahoo): 24 servers in different regions allowed users to join a program in the order of seconds
PPLive is utilizing CDN services
On the measurement side, examine what real-world systems do and experience
On the technical side, derive the relationship between the expected number of viewers (along with their joining behaviors) and the amount of server provisioning
Further, how should servers be geographically distributed?
References
B. Li, S. Xie, Y. Qu, Y. Keung, C. Lin, J. Liu, and X. Zhang, "Inside the New Coolstreaming: Principles, Measurements and Performance Implications," in Proc. of IEEE INFOCOM, Apr. 2008.
S. Xie, B. Li, G. Y. Keung, and X. Zhang, "Coolstreaming: Design, Theory and Practice," IEEE Transactions on Multimedia, 9(8): 1661-1671, Dec. 2007.
B. Li, S. Xie, G. Y. Keung, J. Liu, I. Stoica, H. Zhang, and X. Zhang, "An Empirical Study of the Coolstreaming+ System," IEEE Journal on Selected Areas in Communications, 25(9): 1-13, Dec. 2007.
Comparison with the first release
The initial system adopted a simple pull-based scheme:
Content availability information exchanged using buffer maps
Per-block overhead
Longer delay in retrieving the video content
The new release implemented a hybrid pull and push mechanism:
Blocks are pushed by a parent node to a child node, except for the first block
Lower overhead associated with each video block transmission
Reduces the initial delay and increases the video playback quality
A multiple sub-stream scheme is implemented, enabling multi-source and multi-path delivery for video streams
The gossip protocol was enhanced to handle the push function
Buffer management and scheduling schemes are re-designed to deal with the dissemination of multiple sub-streams
Gossip-based Dissemination
Gossip protocol (also used in BitTorrent) works in iterations:
Nodes send messages to random sets of nodes
Each node does similarly in every round
Messages gradually flood the whole overlay
Pros: simple, robust to random failures, decentralized
Cons: latency trade-off
Related to Coolstreaming: used to update membership content and to handle multiple sub-streams
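A minimal sketch of one gossip round for membership dissemination, under the simplifying assumption that each node forwards its known member set to a small random fan-out of neighbors; the fan-out of 3 is an illustrative choice.

```python
# Minimal sketch of one gossip round: each node forwards newly learned member
# ids to a small random subset of the nodes it already knows.
import random

def gossip_round(views, fanout=3):
    """views: dict {node: set(known node ids)}. Returns updated views."""
    updates = {node: set() for node in views}
    for node, known in views.items():
        others = sorted(known - {node})
        for target in random.sample(others, min(fanout, len(others))):
            if target in views:                  # deliver our view to the target
                updates[target] |= known
    return {node: views[node] | updates[node] for node in views}

views = {"A": {"A", "B"}, "B": {"B", "C"}, "C": {"C", "A"}}
print(gossip_round(views))   # membership knowledge gradually spreads
```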
Multiple Sub-streams
Video stream is divided into blocks
Each block is assigned a sequence number
(Figure: an example of stream decomposition)
Adoption of the gossip concept from P2P file-sharing applications
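As a sketch of the decomposition idea, assuming blocks are interleaved across sub-streams by sequence number modulo the number of sub-streams; this mapping is an illustrative assumption.

```python
# Minimal sketch of stream decomposition: block with sequence number i is
# assigned (here, by assumption) to sub-stream i mod K, interleaving blocks
# across K sub-streams.
def decompose(block_seqs, num_substreams=2):
    """Map each block sequence number to a sub-stream index."""
    substreams = {k: [] for k in range(num_substreams)}
    for seq in block_seqs:
        substreams[seq % num_substreams].append(seq)
    return substreams

print(decompose(range(8)))   # {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
```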
Buffering
Synchronization Buffer: a received block is first put into the synchronization buffer of the corresponding sub-stream; blocks with continuous sequence numbers are then combined
Cache Buffer: combined blocks are stored in the cache buffer
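A minimal sketch of this two-stage buffering with illustrative data structures: per-sub-stream synchronization buffers keyed by sequence number, and a cache buffer that absorbs contiguous runs of blocks.

```python
# Minimal sketch of the two-stage buffering described above. The data
# structures are illustrative assumptions, not the actual implementation.
class Buffering:
    def __init__(self, num_substreams=2):
        self.num = num_substreams
        self.sync = {k: {} for k in range(num_substreams)}  # seq -> block data
        self.cache = []                                      # ordered, contiguous blocks
        self.next_seq = 0                                    # next seq expected in cache

    def receive(self, seq, data):
        self.sync[seq % self.num][seq] = data                # stage 1: sync buffer
        self._combine()

    def _combine(self):
        # stage 2: move blocks into the cache while the next sequence number exists
        while self.next_seq in self.sync[self.next_seq % self.num]:
            self.cache.append(self.sync[self.next_seq % self.num].pop(self.next_seq))
            self.next_seq += 1

buf = Buffering()
for seq in (1, 0, 2, 4):                  # block 3 missing, so block 4 waits in sync buffer
    buf.receive(seq, f"blk{seq}")
print(buf.cache, buf.next_seq)            # ['blk0', 'blk1', 'blk2'] 3
```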
Parent-children and partnership
Partners are connected via TCP connections
Parents deliver video streams to their children over these TCP connections
Peer Join and Adaptation
Stream bit-rate normalized to ONE
Two sub-streams
The weight of a node is its outgoing bandwidth
Node E is a newly arrived node
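The join example suggests that a newly arrived peer picks a parent per sub-stream based on node weights (outgoing bandwidth). Below is a hedged sketch of one plausible greedy selection rule; the rule and the capacity bookkeeping are assumptions for illustration, not CoolStreaming's actual adaptation algorithm.

```python
# Hedged sketch of parent selection for a newly arrived peer. With the stream
# bit-rate normalized to 1 and two sub-streams, each sub-stream costs 0.5 of
# a parent's outgoing bandwidth. Greedy "most spare capacity first" selection
# is an illustrative assumption.
def select_parents(candidates, num_substreams=2, substream_cost=0.5):
    """candidates: dict {node: spare outgoing bandwidth}. Returns {substream: parent}."""
    spare = dict(candidates)
    chosen = {}
    for s in range(num_substreams):
        parent = max(spare, key=spare.get)          # most spare capacity first
        if spare[parent] < substream_cost:
            raise RuntimeError("not enough capacity among candidate parents")
        chosen[s] = parent
        spare[parent] -= substream_cost
    return chosen

# Node E joins with candidate parents A-D and their spare outgoing bandwidths
print(select_parents({"A": 1.0, "B": 0.5, "C": 0.25, "D": 0.0}))
# {0: 'A', 1: 'A'} -> A has enough spare capacity to serve both sub-streams
```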
Peer Adaptation in Coolstreaming
Inequality (1) is used to monitor the buffer status of the received sub-streams at node A; if it does not hold, at least one sub-stream is delayed beyond the threshold value Ts
Inequality (2) is used to monitor the buffer status of the parents of node A; if it does not hold, a parent node is considerably lagging behind, in the number of blocks received, compared to at least one partner that is currently not a parent of node A
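The inequalities themselves are not shown on the slide; the sketch below is an interpretation of the wording above, assuming each sub-stream's progress is tracked by its highest received block sequence number and lags are compared against the threshold Ts. It is not the exact formulation from the paper.

```python
# Hedged sketch of the two buffer-monitoring checks described above, assuming
# progress of a sub-stream is its highest received block sequence number.
def substream_lag_ok(my_progress, Ts):
    """Check (1): no received sub-stream lags the most advanced one by more than Ts."""
    return max(my_progress) - min(my_progress) <= Ts

def parents_ok(parent_progress, partner_progress, Ts):
    """Check (2): no parent lags any non-parent partner by more than Ts blocks."""
    return all(p >= max(partner_progress) - Ts for p in parent_progress)

print(substream_lag_ok([120, 118], Ts=5))          # True: sub-streams roughly in sync
print(parents_ok([100, 96], [112, 110], Ts=5))     # False: a parent lags a partner badly
```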
Conceptual Overlay Topology
Source node “O”
Super-peers: {A, B, C, D}
Moderate-peers: {a}
Casual-peers: {b, c, d}