Spotify - Large Scale, Low Latency, P2P Music-on-Demand Streaming

Presenter: Y.S.Horawalavithana

Authors: Gunnar Kreitz (KTH, Sweden), Fredrik Niemelä (KTH, Sweden)

Upload: yasanka-sameera-horawalavithana

Post on 19-Jul-2015



Talk

• Spotify

• Taxonomy

• Protocol

• P2P Overlay

• Evaluation

• Discussion

IEEE 10th International Conference on P2P computing 2

What is Spotify?

• Peer assisted on-demand music streaming

• Hybrid platform
  • Client/server
  • P2P
  • Pub/Sub

• Large catalog of music (Over 8 million tracks)

• 24 million user base

• Available as Web, Desktop & Mobile clients


Taxonomy – Related Work


[Figure: taxonomy of related work. Applications: on-demand music streaming, on-demand video streaming, on-demand file sharing. Architectures: client/server, P2P overlay (structured, unstructured).]

Problem Statement

How can a custom, peer-assisted network protocol be implemented and adopted for on-demand streaming?


Spotify Protocol: Overview

• Network protocol
  • Designed for streaming music

• Audio stream
  • Encoding
    • Ogg Vorbis: quality q5 (160 kbps) or q9 (320 kbps)
  • Bit rate
    • 96-320 kbps

• Simple design


Spotify Protocol: Design

• Design goals
  • Simplicity
    • A client uploads a track only if it has the whole track
  • Reliability
    • TCP instead of UDP

• Why TCP?
  • TCP's congestion control
  • Re-sending of lost packets


[Figure labels: playback latency and stutter, the two disturbances to the listener]

Spotify Protocol: Caching

• Importance
  • Individual benefit: for the user
  • Group benefit: for peers in the P2P overlay

• Default setting
  • 10% of free disk space
  • User-configurable, 1-100 GB

• Caches are large
  • 56% are over 5 GB

• Cache eviction: Least Recently Used
  • But?


I feel like I've heard all of this before…
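The least-recently-used eviction named on the caching slide can be sketched as follows. This is a minimal Python illustration, not Spotify's implementation: the class name `TrackCache`, its methods, and the byte-based capacity are assumptions made for the example.

```python
from collections import OrderedDict


class TrackCache:
    """Hypothetical LRU cache keyed by track id, bounded by total bytes."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._tracks = OrderedDict()  # track_id -> size, least recent first

    def get(self, track_id):
        """Return True on a cache hit and mark the track as recently used."""
        if track_id not in self._tracks:
            return False
        self._tracks.move_to_end(track_id)
        return True

    def put(self, track_id, size_bytes):
        """Store a whole track, evicting least recently used tracks if needed."""
        if size_bytes > self.capacity_bytes:
            return  # the track can never fit, so it is not cached
        if track_id in self._tracks:
            self.used_bytes -= self._tracks.pop(track_id)
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self._tracks.popitem(last=False)
            self.used_bytes -= evicted_size
        self._tracks[track_id] = size_bytes
        self.used_bytes += size_bytes

    def tracks(self):
        """Track ids ordered from least to most recently used."""
        return list(self._tracks)
```

Playing a cached track calls `get`, which moves it to the "recent" end, so tracks that are never replayed are the first to be evicted when space runs out.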

Spotify Protocol: Streaming a Track

• Request first piece from Spotify servers
  • Over the already open TCP connection

• Meanwhile, search for peers with track


Spotify Protocol: Streaming a Track (Contd.)

• Server: the more reliable source
  • TCP connection between client and server is already established
    • No repeated 3-way handshake
    • Long-lived, but bursty
  • TCP congestion control, e.g. TCP NewReno, CUBIC

• "Cheated"
  • By a kernel configuration
  • Against RFC 5681
  • But it worked: reduced average playback latency


P2P: OK, why am I really here?

Client: Please ask my play-out buffer.


Spotify Protocol: Streaming a Track (Contd.)

• Where to stream from: server or P2P?
  • A local decision, based on the amount of data in the client's play-out buffer
  • When the buffer is sufficiently full, download only from the P2P network

• "Emergency mode"
  • Entered when the buffer becomes critically low (less than 3 seconds of audio buffered during playback)
  • The client stops uploading data to peers
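The buffer-driven decision above can be sketched as a small function. Only the 3-second emergency threshold comes from the slide; the "sufficient" threshold of 10 seconds, the function name, and the behavior between the two thresholds are assumptions made for illustration.

```python
EMERGENCY_BUFFER_SECONDS = 3.0    # from the slide: "less than 3 seconds"
SUFFICIENT_BUFFER_SECONDS = 10.0  # hypothetical; the slide gives no number


def choose_sources(buffered_seconds, playing=True):
    """Return (download_sources, may_upload) for the current buffer level."""
    if playing and buffered_seconds < EMERGENCY_BUFFER_SECONDS:
        # Emergency mode: stop uploading to peers. Using every available
        # source here is an assumption of this sketch.
        return {"server", "p2p"}, False
    if buffered_seconds >= SUFFICIENT_BUFFER_SECONDS:
        # Buffer is healthy: download from peers only.
        return {"p2p"}, True
    # In between (assumed): keep both sources available.
    return {"server", "p2p"}, True
```

For example, `choose_sources(20.0)` selects peers only, while `choose_sources(1.0)` returns both sources and disables uploading.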


Spotify Protocol: Streaming a Track (Contd.)

• For a given track
  • Request simultaneously from both the server and the P2P network

• If a peer is slow, re-send the request to another peer

• Which peer?
  • Greedily select the peer with the lowest expected download time
  • 16 KB chunks, requested sequentially
  • Within at most 2 seconds
  • What if a chunk is urgent?
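The greedy selection above can be sketched as follows. The 16 KB chunk size and the 2-second limit come from the slide; the peer dictionaries, the way download time is estimated (queued bytes divided by measured bandwidth), and the function names are assumptions for illustration. The sketch does not cover the urgent case, which the slide leaves open.

```python
CHUNK_BYTES = 16 * 1024   # from the slide: 16 KB chunks
MAX_WAIT_SECONDS = 2.0    # from the slide: "within at most 2 seconds"


def expected_download_time(peer, extra_bytes=CHUNK_BYTES):
    """Seconds until the peer would finish its queue plus one more chunk."""
    return (peer["outstanding_bytes"] + extra_bytes) / peer["bytes_per_second"]


def assign_chunks(peers, num_chunks):
    """Greedily assign chunk indices 0..num_chunks-1, in order, to peers.

    Returns a list of (chunk_index, peer_name). Chunks that no peer can
    deliver in time stay unassigned (for example, left to the server).
    """
    assignment = []
    if not peers:
        return assignment
    for chunk in range(num_chunks):
        best = min(peers, key=expected_download_time)
        if expected_download_time(best) > MAX_WAIT_SECONDS:
            break  # every peer is too slow or too busy
        best["outstanding_bytes"] += CHUNK_BYTES
        assignment.append((chunk, best["name"]))
    return assignment
```

Because each assignment increases the chosen peer's queue, a fast peer is used until its expected completion time rises above that of a slower peer, at which point the slower peer starts receiving requests.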


Spotify Protocol: Random vs. Regular Access

Random track selection

• User chooses a new track to play

• User jumps into the middle of a track

• 39% of playbacks

Predictable track selection

• Next track starts as the current one finishes

• User presses the forward button

• 61% of playbacks


Spotify Protocol: Predictable track selection

• Prefetching
  • Towards the end of a track, start prefetching the next one

• But when?
  • Too late
  • Too early

• Prefetch the next track at time T - t, where T is the duration of the current track
  • t = 30 s: search the P2P network
  • t = 10 s: fetch from the server
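The two prefetch thresholds above can be written as a small decision function. The 30-second and 10-second values come from the slide; the function name, the returned labels, and the `have_next_track` flag are assumptions for illustration.

```python
SEARCH_P2P_BEFORE_END = 30  # seconds before the end of the track (t = 30)
USE_SERVER_BEFORE_END = 10  # seconds before the end of the track (t = 10)


def prefetch_action(position, duration, have_next_track=False):
    """Decide what to do about the next track.

    position and duration are in seconds. Returns "none", "search_p2p"
    or "fetch_from_server".
    """
    if have_next_track:
        return "none"  # the next track is already available locally
    remaining = duration - position
    if remaining <= USE_SERVER_BEFORE_END:
        return "fetch_from_server"
    if remaining <= SEARCH_P2P_BEFORE_END:
        return "search_p2p"
    return "none"
```

For a 240-second track, the sketch starts searching peers at 210 seconds and falls back to the server at 230 seconds if the next track is still missing.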


Spotify Protocol: Play-out Delay

• Minimize latency while avoiding stutter

• TCP throughput varies
  • Sensitive to packet loss
  • Bandwidth over wireless media varies

• Model throughput as a Markov chain and simulate the playback of the track

• Heuristics
  • Packet delay variation
  • Packet loss
  • TCP congestion control
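The idea of modeling throughput as a Markov chain and simulating playback can be sketched as follows. This is an illustration under stated assumptions, not the client's actual model: the state names, throughput values, transition probabilities, one-second time step, and the default of 100 runs are all hypothetical.

```python
import random


def simulate_stutter(states, transitions, start_state, buffered_bytes,
                     remaining_bytes, bitrate_bytes_per_s, rng):
    """Simulate one playback, second by second; return True if it stutters.

    states: state name -> throughput in bytes per second
    transitions: state name -> list of (next_state, probability)
    """
    state = start_state
    to_download = remaining_bytes
    buffered = buffered_bytes
    while to_download > 0:
        received = min(states[state], to_download)
        to_download -= received
        buffered += received
        if buffered < bitrate_bytes_per_s and to_download > 0:
            return True  # buffer under-run before the download finished
        buffered = max(0, buffered - bitrate_bytes_per_s)
        # Move to the next throughput state of the chain.
        next_states, weights = zip(*transitions[state])
        state = rng.choices(next_states, weights=weights)[0]
    return False  # the rest of the track plays from the buffer


def stutter_probability(runs=100, seed=0, **kwargs):
    """Fraction of simulated playbacks that stutter."""
    rng = random.Random(seed)
    return sum(simulate_stutter(rng=rng, **kwargs) for _ in range(runs)) / runs
```

A client could start playback once the estimated stutter probability for the current buffer level falls below a chosen limit, trading a slightly longer play-out delay for fewer stutters.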


P2P: OK, why am I really here?

Server: For scalability, through load balancing and bandwidth utilization.


P2P Overlay

• Unstructured overlay (not a distributed hash table)
  • No super nodes
  • Directly connected, equal peers

• Split overlay network
  • Weakly clustered by interest


P2P Overlay: Split Overlay


P2P Overlay: Locating Peers

• How to locate peers that have a specific track?
  • The client looks for and connects to new peers when it starts streaming a new track
  • No overlay routing

• Two mechanisms
  1. Tracker-based
  2. Querying the overlay


[Figure: example overlay of peers A-E, labeled with tracks such as "Diamond" (Rihanna), "Paradise" (Coldplay), "Summer" (Calvin), and "Mirror" (JT)]

P2P Overlay: Tracker

• Similar to BitTorrent, but not identical

• Server-side tracker
  • Remembers only 20 peers per track
  • Returns 10 (online) peers to the client on each query


Tracker table on the Spotify server:

Track     Peer  Online flag
Diamond   A     Yes
Paradise  B     Yes
Summer    D     Yes
Diamond   F     No
Paradise  C     Yes
Summer    E     Yes
…         …     …
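A tracker with the two limits above (20 peers remembered per track, 10 online peers returned per query) can be sketched as follows. The class and method names are hypothetical, and two behaviors are assumptions of this sketch rather than facts from the talk: the oldest entry is dropped when a track's list is full, and the most recently announced online peers are returned first.

```python
from collections import defaultdict, deque

PEERS_REMEMBERED_PER_TRACK = 20  # from the slide
PEERS_RETURNED_PER_QUERY = 10    # from the slide


class Tracker:
    """Hypothetical server-side tracker: track -> recently announced peers."""

    def __init__(self):
        self._peers = defaultdict(
            lambda: deque(maxlen=PEERS_REMEMBERED_PER_TRACK))
        self._online = set()

    def set_online(self, peer, online):
        """Update the online flag of a peer."""
        if online:
            self._online.add(peer)
        else:
            self._online.discard(peer)

    def announce(self, track, peer):
        """Record that `peer` has `track`; the oldest entry drops out at 20."""
        peers = self._peers[track]
        if peer in peers:
            peers.remove(peer)
        peers.append(peer)

    def query(self, track):
        """Return up to 10 online peers, most recently announced first."""
        peers = self._peers.get(track, ())
        online = [p for p in reversed(peers) if p in self._online]
        return online[:PEERS_RETURNED_PER_QUERY]
```

With the example table, a query for "Diamond" would return only peer A, because peer F is flagged as offline.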

P2P Overlay: Query the overlay

• Similar to Gnutella

• Broadcast query
  • Reaches the 2-hop neighborhood in the overlay

• Priority of "interest" queries
  • L1: currently streaming track
  • L2: prefetching next track
  • L3: offline synchronization


[Figure: example overlay of peers A-F; a query for "Summer" (Calvin) and a peer holding it as a cached track]
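The Gnutella-like broadcast limited to a 2-hop neighborhood can be sketched as a bounded breadth-first search. The data structures and the function name are assumptions for illustration; suppressing duplicate visits is also an assumption of this sketch.

```python
def query_overlay(neighbors, cached, origin, track, max_hops=2):
    """Return peers within `max_hops` of `origin` that have `track` cached.

    neighbors: peer -> list of directly connected peers
    cached: peer -> set of cached track names
    """
    visited = {origin}
    frontier = [origin]
    found = []
    for _ in range(max_hops):
        next_frontier = []
        for peer in frontier:
            for neighbor in neighbors.get(peer, []):
                if neighbor in visited:
                    continue  # each peer handles the query only once
                visited.add(neighbor)
                next_frontier.append(neighbor)
                if track in cached.get(neighbor, set()):
                    found.append(neighbor)
        frontier = next_frontier
    return found
```

In a chain A-B-C-D where C and D both cache the track, a query from A finds C (2 hops away) but not D (3 hops away).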

P2P Overlay: Neighbor Eviction

• Neighbor eviction by heuristic evaluation of utility

1. Bytes sent in the last 10 minutes

2. Bytes sent in the last 60 minutes

3. Bytes received in the last 10 minutes

4. Bytes received in the last 60 minutes

5. The number of peers found through searches sent over the connection in the last 60 minutes

6. The number of tracks held by the peer that the client has been interested in during the last 10 minutes
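One way to combine the six criteria above into a single utility is a rank-based score, sketched below. The talk does not specify how the criteria are combined, so the scoring rule (points by rank per criterion, equal weights, no points for a value of zero) and all names are assumptions made for illustration.

```python
CRITERIA = (
    "bytes_sent_10m", "bytes_sent_60m",
    "bytes_received_10m", "bytes_received_60m",
    "peers_found_60m", "interesting_tracks_10m",
)


def utility_scores(neighbors):
    """neighbors: name -> dict with the six statistics. Returns name -> points."""
    scores = {name: 0 for name in neighbors}
    for criterion in CRITERIA:
        # Rank neighbors from worst to best on this criterion.
        ranked = sorted(neighbors, key=lambda n: neighbors[n][criterion])
        for position, name in enumerate(ranked, start=1):
            if neighbors[name][criterion] > 0:
                scores[name] += position  # better rank, more points
    return scores


def pick_eviction(neighbors):
    """Return the neighbor with the lowest total utility."""
    scores = utility_scores(neighbors)
    return min(scores, key=scores.get)
```

Ranking avoids mixing units (bytes versus counts of peers or tracks) in a single sum; ties within a criterion are broken by input order in this sketch.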


P2P Overlay: NAT Traversal

• No NAT traversal
  • Is it about TCP?

• How to mitigate the lack of NAT traversal?
  1. "Symmetric" establishment of the connection
  2. Universal Plug and Play (UPnP) protocol


Evaluation: Sources of measurements

• Measurements collected 23–29 March 2010

• Connection statistics
  • Clients send a report every 30 minutes

• Raw log messages
  • Log server + Hadoop cluster

• Open-source "Munin" monitoring system


How well does Spotify work?


Evaluation: Data Sources


Evaluation: Tradeoff

• Tradeoff
  • Server load vs. playback latency vs. stutter frequency

• Playback latency = Network latency + (Network latency) + Decoding


[Figure annotations on the latency terms: Digital Rights Management (DRM); request track over the network]

Evaluation: Playback latency

• Median latency: 265 ms

• 75th percentile: 515 ms

• 90th percentile: 1047 ms

• As seen earlier, 56% of the data comes from the cache

• Playback latency
  • DRM + local processing


Evaluation: Stutter Frequency

• Fewer than 1% of playbacks had stutter occurrences

• Due to local CPU effects


Evaluation: Popularity of track access

• 60% of the catalog was accessed

• 88% of track playbacks were within the most popular 12% of tracks

• 79% of server requests were within the most popular 21% of tracks


Evaluation: Locating Peers


Evaluation: Protocol Overhead


Evaluation: Churn


• Compared with the earlier data sources graph, data delivery is not severely impacted by
  • clients logging out (evening)
  • the daily dip (morning)

Future Work

• Play-out strategy adapted to P2P streaming

• User satisfaction metrics

• Music-on-demand streaming

• Specialized overlays exploiting similarity in taste


Thank You!!
