Post on 20-Dec-2015
On Managing Continuous Media Data
Edward Chang, Hector Garcia-Molina
Stanford University
2
Challenges
Large Volume of Data
  MPEG2 100 Minute Movie: 3-4 GBytes
Large Data Transfer Rate
  MPEG2: 4 to 6 Mbps
  HDTV: 19.2 Mbps
Just-in-Time Data Requirement
Simultaneous Users
3
...Challenges
Traditional Optimization Objectives:
  Maximizing Throughput!
  Maximizing Throughput!!
  Maximizing Throughput!!!
How about Cost?
How about Initial Latency?
4
Related Work
USC (S. Ghandeharizadeh)
UCLA (R. Muntz)
UBC (Raymond Ng)
Bell Labs. (B. Ozden)
IBM Tom Watson Labs. (P. Yu)
etc.
5
Outline
Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency
Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency
Client
  Handling VBR
  Supporting VCR-like Functions
6
Conventional Wisdom (for Single Disk)
Reducing Disk Latency leads to Better Disk Utilization
Reducing Disk Latency leads to Higher Throughput
Increasing Disk Utilization leads to Improved Cost Effectiveness
7
Is Conventional Wisdom Right?
Does Reducing Disk Latency lead to Better Disk Utilization?
Does Reducing Disk Latency lead to Higher Throughput?
Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
8
Tseek: Disk Latency
TR: Disk Transfer Rate
DR: Display Rate
S: Segment Size (Peak Memory Use per Request)
T: Service Cycle Time
9
S = DR × T
T = N × (Tseek + S/TR)
10
Disk Utilization
S = (N × TR × DR × Tseek) / (TR - N × DR)
S is directly proportional to Tseek
Dutil = (S/TR) / (S/TR + Tseek)
Dutil is Constant!
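A quick numeric check of the claim that Dutil does not depend on Tseek. This is a minimal sketch that plugs hypothetical values into S = (N × TR × DR × Tseek) / (TR - N × DR) and Dutil = (S/TR) / (S/TR + Tseek). The disk and stream parameters below are assumptions for illustration, not figures from the talk. Substituting S into Dutil algebraically gives N × DR / TR, which the loop confirms.

```python
def segment_size(n, tr, dr, tseek):
    """S = (N * TR * DR * Tseek) / (TR - N * DR); rates in bytes/s, Tseek in s."""
    if tr <= n * dr:
        raise ValueError("disk bandwidth exhausted: need TR > N * DR")
    return n * tr * dr * tseek / (tr - n * dr)


def disk_utilization(s, tr, tseek):
    """Dutil = (S/TR) / (S/TR + Tseek): share of each IO spent transferring data."""
    transfer_time = s / tr
    return transfer_time / (transfer_time + tseek)


# Hypothetical parameters: a 10 MB/s disk serving 10 streams of 0.5 MB/s each.
TR, DR, N = 10e6, 0.5e6, 10

utils = []
for tseek in (0.005, 0.010, 0.020):
    s = segment_size(N, TR, DR, tseek)
    u = disk_utilization(s, TR, tseek)
    utils.append(u)
    print(f"Tseek = {tseek * 1000:.0f} ms  S = {s / 1e3:.0f} KB  Dutil = {u:.3f}")
```

Halving Tseek halves the segment size (and so the memory per request), while the utilization stays at N × DR / TR.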
11
Is Conventional Wisdom Right?
Does Reducing Disk Latency lead to Better Disk Utilization? NO!
Does Reducing Disk Latency lead to Higher Throughput?
Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
12
What Affects Throughput?
[Diagram: Disk Latency, Memory Utilization, and Disk Utilization shown as factors of Throughput; one link is marked × and one is marked ?]
13
Memory Requirement
We Examine the Memory Requirements of Two Disk Scheduling Policies
  Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead
  Fixed-Stretch: Suffers from High Seek Overhead
14
Per User Peak Memory Use
S = (N × TR × DR × Tseek) / (TR - N × DR)
15
Sweep (Elevator)
Disk Latency: Minimum
IO Time Variability: Very High
16
Sweep (Elevator)
Memory Sharing: Poor
Total Memory Requirement: 2 * N * Ssweep
17
Fixed-Stretch
Disk Latency: High (because of Stretch)
IO Variability: No (because of Fixed)
18
Fixed-Stretch
Memory Sharing: Good
Total Memory Requirement: 1/2 * N * Sfs
19
Throughput
Sweep
  Total Memory: 2 * N * Ssweep
  Available Memory = 40 Mbytes
  N = 40
Fixed-Stretch
  Total Memory: 1/2 * N * Sfs
  Available Memory = 40 Mbytes
  N = 42 (Higher Throughput)
* Based on a Realistic Case Study Using Seagate Disks
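The comparison can be sketched as a search for the largest N whose total memory requirement fits the budget, using 2 × N × S for Sweep and 1/2 × N × S for Fixed-Stretch, with S = (N × TR × DR × Tseek) / (TR - N × DR). All disk parameters below are hypothetical (the Seagate figures from the case study are not in the slides), so the output illustrates the effect and is not expected to reproduce N = 40 and N = 42.

```python
def segment_size(n, tr, dr, tseek):
    """S = (N * TR * DR * Tseek) / (TR - N * DR); rates in bytes/s, Tseek in s."""
    return n * tr * dr * tseek / (tr - n * dr)


def max_streams(memory, tr, dr, tseek, mem_factor):
    """Largest N with TR > N * DR and mem_factor * N * S(N) <= memory."""
    n = 0
    while tr > (n + 1) * dr:
        needed = mem_factor * (n + 1) * segment_size(n + 1, tr, dr, tseek)
        if needed > memory:
            break
        n += 1
    return n


# Hypothetical setup: 10 MB/s disk, 1.5 Mbps streams, 40 MB of buffer memory.
TR = 10e6            # bytes/s
DR = 1.5e6 / 8       # bytes/s
MEMORY = 40e6        # bytes

# Assumed per-IO latencies: Sweep seeks less, Fixed-Stretch seeks more.
n_sweep = max_streams(MEMORY, TR, DR, tseek=0.006, mem_factor=2.0)
n_fs = max_streams(MEMORY, TR, DR, tseek=0.015, mem_factor=0.5)

print("Sweep:", n_sweep, "streams")
print("Fixed-Stretch:", n_fs, "streams")
```

Under these assumed numbers, the policy with 2.5 times the disk latency still admits more streams, because its memory factor is four times smaller.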
20
What Affects Throughput?
[Diagram: Disk Latency, Memory Utilization, and Disk Utilization shown as factors of Throughput; one link is marked × and one is marked ?]
21
Is Conventional Wisdom Right?
Does Reducing Disk Latency lead to Better Disk Utilization? NO!
Does Reducing Disk Latency lead to Higher Throughput? NO!
Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
22
Per Stream Cost
23
Per-Stream Memory Cost
Cm × S = (Cm × N × TR × DR × Tseek) / (TR - N × DR)
24
Example
Disk Cost: $200 a unit
Memory Cost: $5 per MByte
Supporting N = 40 Requires 60 MBytes of Memory
  $200 + $300 = $500
Supporting N = 50 Requires 160 MBytes of Memory
  $200 + $800 = $1,000
For the same cost of $1,000, it’s better to buy 2 Disks and 120 MBytes to support N = 80 Users!
Memory Use is Critical
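The arithmetic of this example can be replayed in a few lines. The prices and memory sizes are the ones on the slide; the function and variable names are illustrative.

```python
DISK_COST = 200.0           # dollars per disk (from the slide)
MEMORY_COST_PER_MB = 5.0    # dollars per MByte (from the slide)


def system_cost(disks, memory_mb):
    """Total hardware cost in dollars for a given number of disks and MBytes."""
    return disks * DISK_COST + memory_mb * MEMORY_COST_PER_MB


# (disks, memory in MBytes, streams supported), as listed on the slide.
configs = {
    "1 disk, N = 40": (1, 60, 40),
    "1 disk, N = 50": (1, 160, 50),
    "2 disks, N = 80": (2, 120, 80),
}

per_stream = {}
for name, (disks, memory_mb, n) in configs.items():
    total = system_cost(disks, memory_mb)
    per_stream[name] = total / n
    print(f"{name}: total ${total:,.0f}, per stream ${per_stream[name]:.2f}")
```

Pushing one disk from 40 to 50 streams raises the per-stream cost from $12.50 to $20.00, while two disks at 40 streams each keep it at $12.50.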
25
Is Conventional Wisdom Right?
Does Reducing Disk Latency lead to Better Disk Utilization? NO!
Does Reducing Disk Latency lead to Higher Throughput? NO!
Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!
26
So What?
27
Outline
Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency
Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency
Client
  Handling VBR
  Supporting VCR-like Functions
28
Initial Latency
What is it?
  The time from when a request arrives at the server to when the data is available in the server’s main memory
Where is it important?
  Interactive applications (e.g., video games)
  Interactive features (e.g., fast-scan)
29
Sweep (Elevator)
30
Fixed-Stretch
Space Out IOs
31
Fixed-Stretch
32
Fixed-Stretch
33
Our Contribution: BubbleUp
Fixed-Stretch Enjoys Fine Throughput
BubbleUp Remedies Fixed-Stretch to Minimize Initial Latency
34
Schedule Office Work
8am: Host a Visitor
9am: Do Email
10am: Write Paper
11am: Write Paper
Noon: Lunch
35
BubbleUp
36
BubbleUp
Empty Slots are Always Next in Time
No Additional Memory Required
  Fill the Buffer up to the Segment Size
No Additional Disk Bandwidth Required
  The Disk Is Idle Otherwise
37
Evaluation
38
Fast-Scan
39
Fast-Scan
40
Data Placement Policies
Please refer to our publications
41
42
Chunk Allocation
Allocate Memory in Chunks
  A Chunk = k * S
Replicate the Last Segment of a Chunk at the Beginning of the Next Chunk
Example
  Chunk 1: s1, s2, s3, s4, s5
  Chunk 2: s5, s6, s7, s8, s9
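The layout rule in the example (chunks of k segments, with each chunk starting on the last segment of the previous one) can be sketched as follows. The function name and the string labels are illustrative; only the replication rule comes from the slide.

```python
def chunk_layout(num_segments, k):
    """Group segments s1..s<num_segments> into chunks of up to k segments,
    repeating the last segment of each chunk at the start of the next."""
    if num_segments < 1:
        raise ValueError("need at least one segment")
    if k < 2:
        raise ValueError("k must be at least 2 so every chunk adds a new segment")
    chunks = []
    start = 1
    while True:
        end = min(start + k - 1, num_segments)
        chunks.append([f"s{i}" for i in range(start, end + 1)])
        if end == num_segments:
            break
        start = end  # the boundary segment is stored again in the next chunk
    return chunks


for number, chunk in enumerate(chunk_layout(9, 5), start=1):
    print(f"Chunk {number}: {', '.join(chunk)}")
```

With 9 segments and k = 5 this prints the two chunks from the slide, s1 to s5 and s5 to s9.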
43
Chunk Allocation
Largest-Fit First
Best Fit (Last Chunk)
44
18 Segment Placement
45
Largest-Fit First
46
Best Fit
47
Outline
Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency
Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency
Client
  Handling VBR
  Supporting VCR-like Functions
48
Unbalanced Workload
49
Balanced Workload
50
Per Stream Memory Use (Use M Disks Independently)
S = (N × TR × DR × Tseek) / (TR - N × DR)
M × N
51
Per Stream Memory Use (Use M Disks As One Disk)
M × N
52
…Continued
S = (N × TR × DR × Tseek) / (TR - N × DR)
S’ = (N × M × TR × M × DR × Tseek) / (TR × M - N × M × DR)
S’ = (M × N × TR × DR × Tseek) / (TR - N × DR) = M × S
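The result S’ = M × S can be checked numerically by evaluating the segment-size formula once for a single disk (N streams, rate TR) and once for the M-disk virtual disk (N × M streams, rate TR × M). The parameter values are hypothetical.

```python
import math


def segment_size(n, tr, dr, tseek):
    """S = (N * TR * DR * Tseek) / (TR - N * DR); rates in bytes/s, Tseek in s."""
    return n * tr * dr * tseek / (tr - n * dr)


# Hypothetical parameters: 4 disks of 10 MB/s, 10 streams per disk at 0.5 MB/s.
N, M = 10, 4
TR, DR, TSEEK = 10e6, 0.5e6, 0.010

s_independent = segment_size(N, TR, DR, TSEEK)       # one disk on its own
s_striped = segment_size(N * M, TR * M, DR, TSEEK)   # M disks as one virtual disk

print(f"S  = {s_independent / 1e3:.0f} KB per stream")
print(f"S' = {s_striped / 1e3:.0f} KB per stream")
print("S' equals M * S:", math.isclose(s_striped, M * s_independent))
```

Both setups serve M × N streams in total, so striping multiplies the per-stream memory by M without adding streams.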
53
Challenges
Using M Disks Independently:
  Unbalanced Workload
  Low Per-Stream Memory Cost
Using M Disks As One Virtual Disk (i.e., Employing Fine-Grained Striping):
  Balanced Workload
  High Per-Stream Memory Cost
54
Our Approach (2DB)
Use Disks Independently
  To Minimize Cost
Replicate Hot Movies (20% Movies)
  To Balance Workload
Use BubbleUp
  To Minimize Initial Latency
55
2D BubbleUp (2DB)
Intelligent Data Placement
Efficient Request Scheduling
FODO, 1998
56
2DB Data Placement: Chunk Allocation
57
2DB Scheduling
Formally, This Is a Bipartite Weighted Matching Problem
  Can be solved using the Hungarian method in O(V^3), where V = NM
We Use a Greedy Method to Reduce the Problem to a Bipartite Unweighted Matching Problem
  Can be solved in O(M^2)
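To make the unweighted matching step concrete, here is a minimal sketch of the textbook augmenting-path algorithm for bipartite matching, applied to requests and the disks holding a copy of the requested data. It is not the greedy reduction from the talk, and its running time is O(V × E) rather than the O(M^2) quoted above. The request and disk numbering is hypothetical.

```python
def max_bipartite_matching(candidates, num_disks):
    """candidates[r] lists the disks that can serve request r.
    Returns (number of matched requests, disk_owner), where disk_owner[d]
    is the request assigned to disk d, or None if the disk stays free."""
    disk_owner = [None] * num_disks

    def try_assign(request, visited):
        # Try each candidate disk; if it is taken, try to move its current
        # owner to another disk (an augmenting path).
        for disk in candidates[request]:
            if disk in visited:
                continue
            visited.add(disk)
            if disk_owner[disk] is None or try_assign(disk_owner[disk], visited):
                disk_owner[disk] = request
                return True
        return False

    matched = sum(try_assign(r, set()) for r in range(len(candidates)))
    return matched, disk_owner


# Request 0 can use disk 0 or 1, request 1 only disk 0, request 2 disk 1 or 2.
candidates = [[0, 1], [0], [1, 2]]
matched, disk_owner = max_bipartite_matching(candidates, num_disks=3)
print("matched requests:", matched)
print("disk -> request:", disk_owner)
```

Request 1 can only be served from disk 0, so request 0 is moved to its replica on disk 1; this is the kind of flexibility that replicating hot movies provides.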
58
Why Does 2DB Work?
59
60
61
Maximum urn load:
n balls, n urns, finite n:
  One random destination per ball: (ln n / ln ln n) × (1 + o(1))
  Two possible destinations per ball: ln ln n / ln 2 + O(1)
m balls, n urns, m > n, infinite m and n:
  d: number of possible destinations
  (ln ln n / ln d) × (1 + o(1)) + O(m/n)
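The gap between one and d possible destinations is easy to see in a small simulation. This sketch throws balls into urns, letting each ball pick the least loaded of d randomly chosen urns. The urn count, the seed, and the choice to sample destinations with replacement are assumptions made for illustration.

```python
import random


def max_load(num_balls, num_urns, d, seed=0):
    """Place each ball in the least loaded of d randomly chosen urns
    (chosen with replacement) and return the fullest urn's load."""
    rng = random.Random(seed)
    loads = [0] * num_urns
    for _ in range(num_balls):
        choices = [rng.randrange(num_urns) for _ in range(d)]
        target = min(choices, key=lambda urn: loads[urn])
        loads[target] += 1
    return max(loads)


n = 10_000
one_choice = max_load(n, n, d=1)
two_choices = max_load(n, n, d=2)
print("max load with 1 destination: ", one_choice)
print("max load with 2 destinations:", two_choices)
```

In the server setting, a ball is a request and an urn is a disk; a movie stored on d disks gives each of its requests d possible destinations.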
62
What Does 2DB Cost?
Storage Cost
  Additional disk cost = % of hot movies
  Typically 20% of movies are subscribed 80% of the time
Throughput
  Throughput is scaled back by a fraction to achieve balanced workload
63
Evaluation
2DB Achieves Balanced Workload with High Throughput
  Compared to, e.g., some dynamic load balancing schemes
2DB Incurs Low Additional Storage Cost
2DB Enjoys Minimum Initial Latency
64
Outline
Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency
Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency
Client
  Handling VBR
  Supporting VCR-like Functions
65
Media Client
Most Studies Assume Dumb Clients
We Propose Smart Clients for
  Handling VBR
  Supporting VCR-like Functions
66
Handling VBR
Server Can Handle VBR
  Frame rate fluctuates, but the moving average does not fluctuate as much
  Rates even out when N is large, which is typically the case
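The evening-out effect for large N can be illustrated with a small simulation. The sketch assumes N independent streams whose instantaneous rates are drawn uniformly between 2 and 6 Mbps (a made-up VBR model, not one from the talk) and reports the relative fluctuation of the aggregate rate.

```python
import random
import statistics


def aggregate_cv(num_streams, num_samples=1000, seed=0):
    """Coefficient of variation (std / mean) of the total bitrate of
    num_streams independent streams, each uniform in [2, 6] Mbps."""
    rng = random.Random(seed)
    totals = [
        sum(rng.uniform(2.0, 6.0) for _ in range(num_streams))
        for _ in range(num_samples)
    ]
    return statistics.pstdev(totals) / statistics.mean(totals)


for n in (1, 10, 100):
    print(f"N = {n:3d}: relative fluctuation = {aggregate_cv(n):.3f}")
```

For independent streams the relative fluctuation shrinks roughly as 1/sqrt(N), which is why the server sees a nearly constant load when many streams are active.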
67
...VBR
But the Server Cannot Eliminate Bitrate Mismatch
  Packetization and channel delay can change the bitrate
The Solution Must Be at the Client Side!
68
Supporting VCR-like Functions
Pause
  Phone call interruptions
  Biological needs
Fast Forward
  Catching up with the program after a pause
Instant Replay
69
How to Pause A Movie?
Broadcast TV Cannot Be Paused
Pausing Via a Point-to-point Link Affects the Server’s Scheduling
Caching!!!
  Main Memory Caching?
  Too expensive! (19.2 Mbps × 20 min ≈ 2.9 GBytes)
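The size of the cache needed to absorb a pause follows directly from the bitrate and the pause length. A minimal sketch, assuming decimal units (1 Mbps = 10^6 bits/s, 1 GByte = 10^9 bytes); the function name is illustrative.

```python
def pause_buffer_bytes(bitrate_mbps, minutes):
    """Bytes that arrive during a pause of the given length."""
    bits = bitrate_mbps * 1e6 * minutes * 60
    return bits / 8


hdtv = pause_buffer_bytes(19.2, 20)
print(f"HDTV, 20 min pause: {hdtv / 1e9:.2f} GBytes")
```

Buffering a 20-minute pause of a 19.2 Mbps HDTV stream therefore takes a few GBytes, which is the motivation for spilling the cache to disk instead of holding it in main memory.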
70
Buffer Management
71
Challenges
Must Ensure Arriving Bits Do Not Overflow the Network Buffer
Must Ensure Decoder Buffer Does Not Underflow
Must Work with Any Off-the-shelf Disk and CPU Box
72
Our Contribution: MEDIC
MEDIC: MEmory & Disk Integrated Cache
MEDIC Manages IOs Between Memory and Disk Efficiently
  Only 4 Mbytes main memory needed!!!
  Makes a set-top box affordable
MEDIC Adapts to Hardware Configuration
73
Demo
Regular Playback
Pause
Resume Regular Playback
Fast Forward
Instant Replay (not shown)
74
Visualize MEDIC
75
Conclusions (Contributions in Blue)
Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency
Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency
Client
  Handling VBR
  Supporting VCR-like Functions
76
…Conclusions
Our Server Supports
  Low Latency Playback and Fast Forward
Our Client Supports
  Pause and Low Latency Instant Replay
Together, We Propose A Complete End-to-end Solution for Continuous Media Data Delivery!
77
Future Work
Enhancing MEDIC for Managing Heterogeneous Data, from Both Broadcast & Internet Channels
  Video Panoramas
  Interactive TV
Indexing Videos for Replay
  Video/Image databases