QuickSync: Improving Synchronization Efficiency for Mobile Cloud Storage Services
Yong Cui, Zeqi Lai, Xin Wang*, Ningwei Dai, Congcong Miao
Tsinghua University; Stony Brook University*
Outline
Background
Measurement & Analysis
Design & Implementation
Evaluation
Conclusion
The way we store data…
Mobile Cloud Storage Services (MCSS)
Data entrance of the Internet
– Major players: Dropbox, Google Drive, OneDrive, Box, …
– Large user base: Dropbox has more than 300 million users
– More and more mobile devices/data: MCSS helps to manage data across your multiple devices
Basic function of MCSS
– Storing, sharing, and synchronizing data from anywhere, on any device, at any time, via ubiquitous cellular/Wi-Fi networks
Architecture & Capabilities of MCSS
Architecture
– Control server: metadata management; storage server: file contents
– Sync process with your multiple clients
Sync efficiency is one of the most important things for MCSS.
Architecture & Capabilities of MCSS
Key capabilities
– Chunking: splitting a large file into multiple data units of a certain size
– Bundling: transmitting multiple small chunks as a single chunk
– Deduplication: avoiding retransmission of content already available in the cloud
– Delta-encoding: transmitting only the modified portion of a file
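The first three capabilities compose naturally. A minimal sketch (fixed-size chunking plus hash-based deduplication; the function name and index structure are illustrative, not any service's actual API):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # Dropbox-style fixed 4 MB chunks

def chunk_and_dedup(data: bytes, cloud_index: set):
    """Split data into fixed-size chunks and return only the chunks
    whose hash the cloud has not seen before (deduplication)."""
    to_upload = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in cloud_index:        # upload new content only
            to_upload.append((digest, chunk))
            cloud_index.add(digest)
    return to_upload

index = set()
data = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
first = chunk_and_dedup(data, index)    # both chunks are new
second = chunk_and_dedup(data, index)   # everything already in the cloud
```

On a second sync of the same content, nothing is uploaded: the whole file is found in the index by hash.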
Can data be synchronized efficiently?
Ideal
– These capabilities MAY efficiently synchronize our data in mobile/wireless environments (high delay & intermittent connections)
Reality
– The sync time is much longer than expected under various network conditions!
Challenges of improving sync efficiency
– Are these capabilities useful and sufficient? What are their relationships?
– Novel angle of view: storage techniques & network techniques
– Closed source & encrypted traffic: hard to identify the root cause
Aim of our work
We identify, analyze, and address the sync inefficiency problem of modern mobile cloud storage services.
Outline
Background
Measurement & Analysis
Design & Implementation
Evaluation
Conclusion
Measurement Methodology
Understand the sync protocol
– Methodology: trace app encrypted traffic
– Decryption for in-depth analysis: hijack the SSL socket of Dropbox
– Three sync/upload stages: sync preparation, data sync, sync finish
Arrangement of our measurements:

Exp.  Capability       Dropbox  Google Drive  OneDrive  Seafile
 -    Chunking         4MB      8MB           Var.      Var.
 1    Dedup.           ✔        ✗             ✗         ✔
 2    Delta encoding   ✔        ✗             ✗         ✗
 3    Bundling         ✔        ✗             ✗         ✗
Identifying the sync inefficiency problem
Redundancy deduplication
– Seafile eliminates more redundancy than Dropbox on the same data set
– Seafile uses a smaller content-defined chunk size than Dropbox
– But Seafile may need more sync time even with reduced traffic!
– Finding more redundancy needs more CPU time
DER: deduplicated file size / original file size
Root cause analysis
Why does less data cost more time?
– Chunking is closely related to deduplication
– Fixed-size chunking is efficient in good network conditions
– Aggressive chunking helps in high-delay environments
– Trade-off between traffic amount and CPU time under various network conditions
– Network-based chunking may be important
Identifying the sync inefficiency problem
Dropbox fails on incremental sync with delta encoding
– 3 operations (flip bits, insert, delete) over continuous bytes of a synced test file
– Insert 2MB at the head of a 40MB file, but 20MB transmitted
TUO: generated traffic size / expected traffic size
Root cause analysis
Why the incremental sync fails
– A large file is split into multiple chunks
– Delta-encoding is only performed between mapped chunks
– Chunking is the basis for delta-encoding
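The failure mode above can be demonstrated in a few lines: with fixed-size chunking, inserting a few bytes at the head shifts every chunk boundary, so no old chunk hash maps to any new chunk, and chunk-wise delta-encoding sees the entire file as changed (a toy illustration with 4-byte chunks, not the actual Dropbox protocol):

```python
import hashlib

def fixed_chunks(data: bytes, size: int):
    """Hashes of the fixed-size chunks of `data`."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

size = 4
orig = b"abcdefghijklmnop"       # 4 chunks: abcd / efgh / ijkl / mnop
edit = b"XX" + orig              # insert 2 bytes at the head

old = fixed_chunks(orig, size)
new = fixed_chunks(edit, size)   # XXab / cdef / ghij / klmn / op
# every boundary shifted: no chunk hash survives, so per-chunk
# delta-encoding finds no mapped chunk to diff against
unchanged = set(old) & set(new)
```

Content-defined chunking avoids this because boundaries are chosen by content, not by offset, so boundaries after the insertion point stay put.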
Identifying the sync inefficiency problem
Incremental sync failure for continuous modification
– Files are modified continuously while still being transmitted to the cloud (e.g., temporary files of MS Word or VMware)
– Incremental sync fails over long-delay networks
Root cause analysis
– The metadata is updated only after chunks are successfully uploaded
– A chunk under the sync process cannot be used for delta-encoding
TUO: generated traffic size / size of revised file
Identifying the sync inefficiency problem
Bandwidth inefficiency
– Synchronize files differing in size
– Sync is not efficient for a large number of small files in high-RTT conditions
– BUE: measured throughput / theoretical TCP bandwidth
Root cause analysis
– The client waits for an ack from the server before transmitting the next chunk
– Sequential acks for each small chunk, and even too many new connections, suffer from TCP slow start, especially with high RTT
– Bundling is quite important
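The cost of waiting one RTT per chunk can be shown with a back-of-the-envelope model (the link speed, chunk size, and RTT values below are illustrative assumptions, not measured numbers):

```python
def effective_throughput(chunk_bytes: int, link_bps: float, rtt_s: float):
    """Client sends one chunk, then idles one RTT waiting for the
    server's app-layer ack before sending the next (no bundling)."""
    send_time = chunk_bytes * 8 / link_bps
    return chunk_bytes * 8 / (send_time + rtt_s)   # bits per second

link = 10e6                    # assumed 10 Mbps access link
small = 64 * 1024              # assumed 64 KB chunk
lan = effective_throughput(small, link, 0.01)   # ~10 ms RTT
wan = effective_throughput(small, link, 0.5)    # ~500 ms RTT
# at high RTT most of each send/ack cycle is idle wait,
# so bandwidth utilization collapses
```

With a 500 ms RTT, under 10% of the link is used; bundling many small chunks into one transfer amortizes the per-chunk wait.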
Identifying the sync inefficiency problem
Bandwidth inefficiency
– Bundling small chunks is important (only Dropbox implements it)
– Bandwidth utilization decreases with large files
Root cause analysis
– Dropbox uses up to 4 concurrent TCP connections
– The client waits for all 4 connections to finish before the next iteration
– Several connections may be idle between two iterations
Outline
Background and Architecture
Measurement & Analysis
Design & Implementation
Evaluation
Conclusion
QuickSync: improving sync efficiency
System overview with 3 techniques
– Propose a network-aware content-defined chunker to identify redundant data
– Design an improved incremental sync approach that correctly performs delta-encoding between similar chunks to reduce sync traffic
– Use delay-batched acks to improve bandwidth utilization
Design detail: Network-aware Chunker
Network-aware chunker
– Intuitively, prefer larger chunks when bandwidth is sufficient, and chunk aggressively in slow networks
– Predefine several chunking strategies in QuickSync
– The client selects a chunking strategy for a file to optimize sync time
[Figure: sync-time model — estimated bandwidth = segment_size × cwnd / RTT; total sync time = total computation time (data size × per-byte computation time) + total communication time]
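The selection logic sketched on this slide can be expressed compactly. The bandwidth estimate follows the slide's segment_size × cwnd / RTT formula; the strategy table (average chunk size, deduplication ratio, per-byte CPU cost) uses made-up illustrative numbers, not QuickSync's actual parameters:

```python
def estimate_bandwidth(segment_size: int, cwnd: int, rtt: float) -> float:
    return segment_size * cwnd / rtt   # bytes per second

# predefined strategies: (avg chunk size, dedup ratio, CPU seconds/byte)
STRATEGIES = [
    (8 * 2**20, 0.95, 5e-9),     # coarse: little dedup, cheap CPU
    (1 * 2**20, 0.85, 2e-8),
    (128 * 2**10, 0.70, 8e-8),   # aggressive: more dedup, costly CPU
]

def pick_strategy(data_size: int, bandwidth: float):
    """Choose the strategy minimizing predicted total sync time
    = total computation time + total communication time."""
    def predicted_time(s):
        _, dedup_ratio, cpu_per_byte = s
        comp = data_size * cpu_per_byte              # CPU cost of chunking
        comm = data_size * dedup_ratio / bandwidth   # bytes left to send
        return comp + comm
    return min(STRATEGIES, key=predicted_time)

data = 200 * 2**20  # a 200 MB file
fast = pick_strategy(data, estimate_bandwidth(1460, 64, 0.01))  # fast Wi-Fi
slow = pick_strategy(data, estimate_bandwidth(1460, 8, 0.6))    # slow cellular
```

On the fast link the coarse strategy wins (CPU time dominates); on the slow, high-RTT link the aggressive strategy wins (saved traffic dominates).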
Design detail: Network-aware Chunker
Virtual chunks in the server
– A file needs to sync with multiple clients in various networks
– Avoid storing multiple copies of a file chunked with different methods
– Propose virtual chunks that store only the offset and length, which can be used to generate pointers to the content
– Support multi-granularity deduplication without additional copies
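A minimal sketch of the virtual-chunk idea (names and layout are illustrative): the server keeps exactly one copy of the file content, and each chunking granularity is represented only by (offset, length) records pointing into it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VirtualChunk:
    offset: int   # start position inside the single stored copy
    length: int   # chunk length; no chunk data is duplicated

def virtual_view(file_size: int, chunk_size: int):
    """Per-granularity index: pointers only, no content copies."""
    return [VirtualChunk(off, min(chunk_size, file_size - off))
            for off in range(0, file_size, chunk_size)]

def materialize(content: bytes, vc: VirtualChunk) -> bytes:
    return content[vc.offset:vc.offset + vc.length]

content = bytes(range(10)) * 100          # 1000-byte file, stored once
coarse = virtual_view(len(content), 256)  # one client's granularity
fine = virtual_view(len(content), 64)     # another client's granularity
```

Both views reconstruct the same content on demand, so the server supports multi-granularity deduplication while storing the file once.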
Design detail: Redundancy eliminator
Two-step sketch-based mapping
– Hash match: two chunks are identical
– Sketch match: two chunks are similar
– Perform delta-encoding between the two most similar chunks
Buffering incompletely synced chunks on the client
– Chunks waiting to be uploaded or delta-encoded are temporarily buffered on the local device
– So incremental sync works in the middle of a sync process (for the use cases of MS Word or VMware temp files)
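The two-step mapping can be sketched as follows. For illustration the "sketch" is simply the set of 8-byte shingles; a real system samples features (e.g. min-hashes) instead of keeping the full set, and the function names here are hypothetical:

```python
import hashlib

def strong_hash(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

def sketch(chunk: bytes) -> set:
    """Illustrative similarity sketch: all 8-byte shingles.
    (A real system would sample a few features instead.)"""
    return {chunk[i:i + 8] for i in range(len(chunk) - 7)}

def map_chunk(new_chunk: bytes, server_chunks: list):
    """Step 1: hash match -> chunk is identical, skip upload.
       Step 2: best sketch match -> delta-encode against it.
       No overlap at all -> upload the chunk in full."""
    h = strong_hash(new_chunk)
    for old in server_chunks:
        if strong_hash(old) == h:
            return "identical", old
    scored = [(len(sketch(old) & sketch(new_chunk)), old)
              for old in server_chunks]
    score, best = max(scored, key=lambda t: t[0])
    return ("delta", best) if score > 0 else ("upload", None)

base = b"the quick brown fox jumps over the lazy dog " * 4
server = [base, b"completely unrelated chunk contents!!!!! " * 4]
action1, _ = map_chunk(base, server)                            # hash match
action2, match = map_chunk(base.replace(b"lazy", b"late"), server)
```

The modified chunk fails the hash match but shares most shingles with the original, so delta-encoding runs against the right candidate instead of re-uploading the whole chunk.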
Design detail: Batched Syncer
Batched transmission
– Defer the app-layer ack to the end of the sync process, and actively check the un-acked chunks upon connection interruption
– The check is triggered when the client captures a network exception, or on timeout
Reuse existing network connections
– Reuse the storage connection to transfer multiple chunks, avoiding the overhead of duplicate TCP/SSL handshakes
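The batched-ack flow can be sketched with a toy in-memory "server" (the callback names and the loss simulation are purely illustrative): chunks are sent back-to-back with no per-chunk wait, and only at the end does the client ask which chunks arrived and resend the rest.

```python
def batched_sync(chunks: dict, send, query_received):
    """Send every chunk back-to-back, defer the app-layer ack to the
    end of the sync, then resend whatever the server reports missing."""
    for cid, data in chunks.items():
        send(cid, data)                       # no per-chunk ack wait
    missing = set(chunks) - query_received()  # single batched ack
    for cid in missing:                       # active check & recovery
        send(cid, chunks[cid])
    return missing

# toy server that silently loses one chunk on its first delivery
received = set()
drop_once = {"c2"}
def send(cid, data):
    if cid in drop_once:
        drop_once.discard(cid)                # lost exactly once
        return
    received.add(cid)

missing = batched_sync({"c1": b"a", "c2": b"b", "c3": b"c"},
                       send, lambda: set(received))
```

Only the lost chunk is retransmitted, so recovery after an exception costs one extra round trip rather than one ack wait per chunk.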
QuickSync implementation
Implementation over Dropbox
– Unable to directly implement on closed-source Dropbox
– Design a proxy-based architecture
– Implement our functionality between a client on a Galaxy Nexus & a proxy in EC2
– Leverage SampleByte for chunking and librsync for delta encoding
Implementation over Seafile
– The proxy architecture adds additional overhead
– Full implementation with Seafile, an open-source system, on both client & server
Outline
Background and Architecture
Measurement & Analysis
Design & Implementation
Evaluation
Conclusion
Impact of the Network-aware Chunker
Setup
– Data set: 200GB backup
– RTT setting: 30~600ms
Performance
– Up to 31% sync speed improvement
CPU overhead
– Client: <12.3%
– Server: <0.5% (per user)
Results show
– The network-aware chunker works well
Impact of the Redundancy Eliminator
Setup & Results
– Conducted the same set of experiments, where TUO = 5-10
– Traffic utilization overhead decreases to 1 (insert)
– Results show: sketch-based mapping works well
– QuickSync only synchronizes new content under an arbitrary number of modifications: local buffering works well
Impact of the Batched Syncer
Setup
– Uploading files differing in size
Performance
– Bandwidth utilization improves by up to 61% (better with high RTT)
– Delayed ack works well
Exception recovery
– Delayed ack MAY lead to high overhead on exception recovery
– Result of active check: overhead is less than Seafile and Dropbox
Performance of the Integrated System
Setup
– Practical sync workloads on Windows / Android
– Source code backup, MS PowerPoint editing, data backup, photo sharing, …
Performance
– Traffic size reduction: up to 80.3% / 63.4% (Win / Android)
– Sync time reduction: up to 51.8% / 52.9% (Win / Android)
Outline
Background and Architecture
Measurement & Analysis
Design & Implementation
Evaluation
Conclusion
Conclusion
Measurement study & sync inefficiency analysis
– The basic capabilities are quite important for sync efficiency
– But they are still far from performing well
– Network parameters are also important
QuickSync: design, implementation, evaluation
– Three key techniques
– Implementation over Dropbox & Seafile
– Practical sync workloads show the efficiency in both traffic size reduction and sync time reduction
Welcome to IETF
Standard sync protocol
– Third-party apps are easy to develop
– APIs will be unnecessary, or at least simplified
– Sync or file sharing among different services becomes available
– Easy to improve Internet storage services
Much interest from the IETF
– We introduced this idea at IETF 93 in July 2015
– IETFers: a good topic for a new working group
Our mail list: [email protected]
https://github.com/iss-ietf/iss/wiki/Internet-Storage-Sync
Thank you!
Questions?
Server-side storage overhead
Server-side overhead: measurement & analysis
– Using smaller chunks increases the metadata overhead (# of chunks)
– The additional overhead of QuickSync is only a very small fraction of the overall storage cost
Related work
Measurement study
– Measurement & performance comparison for enterprise cloud storage services [IMC'10]
– Measurement & benchmarking for personal cloud storage [IMC'12] [IMC'13]
– Sync traffic waste [IMC'14]
– Our measurement identifies and reveals the root cause of sync inefficiency in wireless networks
Related work
Storage systems
– Enterprise data backup system [FAST'14]
– Data consistency in sync services [FAST'14]
– Reducing sync overhead [Middleware'13]
– QuickSync focuses on performance in wireless networks, and introduces network-related considerations
Related work
CDC and delta encoding
– Content-defined chunking [FAST'08] [FAST'12] [INFOCOM'14] [SIGCOMM'00]
– Delta encoding: rsync, https://rsync.samba.org/
– QuickSync utilizes CDC for a unique purpose, adaptively selecting an optimized chunk size to achieve sync efficiency
– We also improve delta encoding and address its limitations in MCSS