virtualization technique for replica synchronization

Virtualization Technique For Replica Synchronization By : Ashwin G.Sancheti Email:[email protected] Instructor : Prof.Randal Burns Date : 19 th Feb 2008


Page 1: Virtualization Technique For Replica Synchronization

Virtualization Technique

For Replica Synchronization

By: Ashwin G. Sancheti

Email: [email protected]

Instructor: Prof. Randal Burns

Date: 19th Feb 2008

Page 2: Virtualization Technique For Replica Synchronization

Roadmap

• Motivation/Goals
• What is virtualization?
• Advantages?
• My previous work
• Architecture and design
• Algorithm and working
• Pros and cons of my work
• New proposed idea
• Different approaches to the proposed idea
• Conclusion

Page 3: Virtualization Technique For Replica Synchronization

Problem and Motivation

• Redundancy in replica synchronization
• Redundant information that may already be present on the receiver side is transferred anyway.
• The time required to transfer the data is very high.
• Disk space usage
• More I/O

e.g.: Patch update example

Page 4: Virtualization Technique For Replica Synchronization

Goal

• Identify the part common to two virtual machines and transfer only the remaining delta, with minimum transmission delay.
• Reduce the I/O traffic
• Reduce the disk space

Page 5: Virtualization Technique For Replica Synchronization

Two Solutions:

• TAPER algorithm
• Replayfs trace (VFS approach)

Page 6: Virtualization Technique For Replica Synchronization

Non virtual machine and VM

Page 7: Virtualization Technique For Replica Synchronization

Advantages

• Server consolidation
• Testing and development
• Dynamic load balancing
• Disaster recovery
• Resource sharing
• Reduced power consumption
• Reduced physical space (land) requirement
• Many others…

Page 8: Virtualization Technique For Replica Synchronization

My Work

• Current problem:
    ◦ A large amount of disk space is needed to store multiple virtual machines.
• Goal:
    ◦ Compute the delta between two virtual machines, then delete one of them and keep only its delta.
    ◦ Save a lot of space.
• Example:
    ◦ Windows 2K Server (plain vanilla)
    ◦ Windows 2K + Exchange Server

Page 9: Virtualization Technique For Replica Synchronization

Architecture and Design

• VMDK architecture
    ◦ Sparse header
    ◦ Descriptor table
    ◦ Grain directories
    ◦ Grain tables
    ◦ Grain size
• CRC computation
• Algorithm to compute the delta part
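The slides do not spell out the delta algorithm, so the following is only a minimal sketch of one way a CRC-based comparison could work: both images are split into fixed-size grains, a CRC-32 is computed per grain, and only the grains of the second image that differ from the first are kept. The grain size, the function names, and the use of raw byte strings instead of real VMDK files are all assumptions made for illustration.

```python
import zlib

GRAIN_SIZE = 4096  # hypothetical grain size in bytes, chosen only for this demo


def grain_crcs(image: bytes, grain_size: int = GRAIN_SIZE) -> list[int]:
    """Return one CRC-32 per fixed-size grain of the image."""
    return [
        zlib.crc32(image[offset:offset + grain_size])
        for offset in range(0, len(image), grain_size)
    ]


def compute_delta(base: bytes, derived: bytes, grain_size: int = GRAIN_SIZE) -> dict[int, bytes]:
    """Map grain index -> grain data for every grain of `derived` that differs from `base`."""
    base_crcs = grain_crcs(base, grain_size)
    delta = {}
    for index, crc in enumerate(grain_crcs(derived, grain_size)):
        start = index * grain_size
        grain = derived[start:start + grain_size]
        same_crc = index < len(base_crcs) and base_crcs[index] == crc
        # A CRC match is only a hint, so confirm it with a byte comparison.
        if not (same_crc and base[start:start + grain_size] == grain):
            delta[index] = grain
    return delta


def apply_delta(base: bytes, delta: dict[int, bytes], derived_len: int,
                grain_size: int = GRAIN_SIZE) -> bytes:
    """Rebuild the derived image from the base image plus the stored delta grains."""
    grains = []
    for index in range((derived_len + grain_size - 1) // grain_size):
        start = index * grain_size
        grains.append(delta.get(index, base[start:start + grain_size]))
    return b"".join(grains)[:derived_len]


# Demo: a "plain" image and a derived image with one changed grain and some appended data.
base = bytes(4 * GRAIN_SIZE)
modified = bytearray(base)
modified[GRAIN_SIZE + 10] = 0xFF
derived = bytes(modified) + b"exchange"

delta = compute_delta(base, derived)
rebuilt = apply_delta(base, delta, len(derived))
print(sorted(delta), rebuilt == derived)
```

Only the changed and the newly added grains end up in the delta, and the second image can be rebuilt from the first image plus that delta.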

Page 10: Virtualization Technique For Replica Synchronization

Test cases and Results

• VM 1: Win 2000 (plain)
    ◦ Size: 1 GB
• VM 2: Win 2000 + Exchange Server
    ◦ Size: 1.3 GB
• Delta VM: only Exchange Server
    ◦ Size: 120 MB
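As a rough check on these figures, the short calculation below compares storing both images in full with storing VM 1 plus the delta. It assumes decimal units (1 GB = 1000 MB), which the slide does not specify, and it assumes the delta replaces the second image on disk.

```python
# Sizes from the slide, in MB (assumption: 1 GB = 1000 MB).
vm1_mb = 1000
vm2_mb = 1300
delta_mb = 120

full_mb = vm1_mb + vm2_mb          # both images stored in full
with_delta_mb = vm1_mb + delta_mb  # VM 1 plus the delta only
saved_mb = full_mb - with_delta_mb
saved_pct = 100 * saved_mb / full_mb

print(f"full: {full_mb} MB, with delta: {with_delta_mb} MB, "
      f"saved: {saved_mb} MB ({saved_pct:.1f}%)")
```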

Page 11: Virtualization Technique For Replica Synchronization

New proposal

• Goal:
    ◦ Determine the common part and the delta between the client machine and the remote machine, and transfer only the delta, with minimum transmission delay.
• Advantages:
    ◦ No need to transmit the whole virtual machine.
    ◦ Less time to transmit the data.

Page 12: Virtualization Technique For Replica Synchronization

First Approach

• TAPER algorithm
    ◦ Directory tree synchronization protocol between the source and target nodes
• Works in four phases
    ◦ Directory tree
    ◦ Large chunks
    ◦ Smaller blocks
    ◦ Bytes

Page 13: Virtualization Technique For Replica Synchronization

First Approach (Contd.)

• Directory matching
• Eliminates identical portions of the directory tree that are common in content and structure
• Hierarchical hash tree implementation
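The directory-matching idea can be sketched with a small hierarchical hash tree. This is an illustrative sketch, not TAPER's actual implementation: directories are modeled as nested Python dicts, files as byte strings, and the names `tree_hash` and `unmatched_files` are hypothetical. A directory's hash is derived from the names and hashes of its children, so two subtrees with equal hashes can be skipped without descending into them.

```python
import hashlib


def tree_hash(node) -> str:
    """Hash a file by its content and a directory by its sorted (name, child hash) pairs."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"file:" + node).hexdigest()
    digest = hashlib.sha1(b"dir:")
    for name in sorted(node):
        digest.update(name.encode() + b"\0" + tree_hash(node[name]).encode() + b"\0")
    return digest.hexdigest()


def unmatched_files(source, target, path=""):
    """List source files that have no identical counterpart at the same path on the target."""
    # Identical subtree (or file): prune it without descending.
    # A real implementation would cache hashes instead of recomputing them per level.
    if target is not None and tree_hash(source) == tree_hash(target):
        return []
    if isinstance(source, bytes):
        return [path]
    target_children = target if isinstance(target, dict) else {}
    result = []
    for name in sorted(source):
        result += unmatched_files(source[name], target_children.get(name), f"{path}/{name}")
    return result


# Demo: only one file differs between the two trees.
source_tree = {
    "etc": {"hosts": b"127.0.0.1 localhost"},
    "app": {"bin": b"version 2", "conf": b"x=1"},
}
target_tree = {
    "etc": {"hosts": b"127.0.0.1 localhost"},
    "app": {"bin": b"version 1", "conf": b"x=1"},
}
remaining = unmatched_files(source_tree, target_tree)
print(remaining)
```

Here the whole `etc` subtree is eliminated by a single hash comparison, and only the changed file is passed on to the later phases.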

Page 14: Virtualization Technique For Replica Synchronization

First Approach (Contd.)

• Matching chunks
    ◦ We are now left with unmatched files at the source and the target.
    ◦ Use content-defined chunking (CDC) to reduce the unmatched data.
    ◦ Chunk boundaries are defined by Rabin fingerprinting.
    ◦ The target sends the SHA values for all the remaining files to the source.
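A minimal sketch of content-defined chunking follows. A simple polynomial rolling hash stands in for Rabin fingerprinting here, and the window size, boundary mask, and function names are assumptions chosen for a small demo rather than values taken from TAPER. A chunk ends wherever the hash of the last `WINDOW` bytes has its low bits equal to zero, so boundaries depend on content rather than on byte offsets.

```python
import hashlib
import random

WINDOW = 16          # bytes covered by the rolling hash (hypothetical value)
MASK = (1 << 6) - 1  # a boundary is declared when the low 6 bits of the hash are zero
BASE = 257
MOD = (1 << 31) - 1


def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries; every chunk but the last has >= WINDOW bytes."""
    chunks, start, rolling = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            rolling = (rolling - data[i - WINDOW] * top) % MOD  # drop the byte leaving the window
        rolling = (rolling * BASE + byte) % MOD
        if i - start + 1 >= WINDOW and (rolling & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def chunk_hashes(data: bytes) -> list[str]:
    """SHA-1 of each chunk: the values one side would send to the other."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in cdc_chunks(data)]


def unmatched_chunks(source: bytes, target_hashes: set[str]) -> list[bytes]:
    """Source chunks whose SHA-1 does not appear among the hashes received from the target."""
    return [c for c in cdc_chunks(source) if hashlib.sha1(c).hexdigest() not in target_hashes]


# Demo: the source equals the target with a few bytes inserted at the front.
rng = random.Random(0)
target_data = bytes(rng.randrange(256) for _ in range(4096))
source_data = b"patch" + target_data

to_send = unmatched_chunks(source_data, set(chunk_hashes(target_data)))
print(len(cdc_chunks(source_data)), "chunks at source,", len(to_send), "unmatched")
```

With fixed-size blocks, the five inserted bytes would shift every later block; with content-defined boundaries the chunking typically realigns after the insertion, so most chunks still match.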

Page 15: Virtualization Technique For Replica Synchronization

First Approach (Contd.)

• Matching blocks
    ◦ Each file at the source is now a series of matched and unmatched regions (holes).
    ◦ Fine-grained block matching is performed, so we are left with only the unmatched data blocks at the source side.
• Matching bytes
    ◦ The blocks in the unmatched data are delta-encoded against similar blocks in the matched set.
    ◦ Finally, the remaining unmatched data and delta bytes are sent to the target using a standard compression algorithm.
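The final phase can be sketched as follows. This is an illustration under stated assumptions, not TAPER's encoder: `difflib.SequenceMatcher` from the Python standard library is used to find the regions an unmatched block shares with a similar, already matched block, and `zlib` plays the role of the standard compression algorithm. The operation format (`"copy"` and `"insert"` tuples) is hypothetical.

```python
import difflib
import zlib


def delta_encode(reference: bytes, block: bytes) -> list[tuple]:
    """Describe `block` as copies from `reference` plus compressed literal inserts."""
    ops = []
    matcher = difflib.SequenceMatcher(None, reference, block, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))  # (offset, length) into the reference
        elif tag in ("replace", "insert"):
            ops.append(("insert", zlib.compress(block[j1:j2])))
        # "delete" regions exist only in the reference, so nothing is emitted for them
    return ops


def delta_decode(reference: bytes, ops: list[tuple]) -> bytes:
    """Rebuild the block on the target from the reference block and the delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += reference[offset:offset + length]
        else:
            out += zlib.decompress(op[1])
    return bytes(out)


# Demo: the unmatched block is a lightly edited copy of a block the target already has.
matched_block = b"The quick brown fox jumps over the lazy dog. " * 4
unmatched_block = matched_block.replace(b"lazy", b"sleepy", 1)

ops = delta_encode(matched_block, unmatched_block)
restored = delta_decode(matched_block, ops)
literal_bytes = sum(len(zlib.decompress(op[1])) for op in ops if op[0] == "insert")
print(len(unmatched_block), "bytes in block,", literal_bytes, "literal bytes in delta")
```

Only the literal inserts and the small copy instructions need to cross the network; the target reconstructs the block from data it already holds.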

Page 16: Virtualization Technique For Replica Synchronization

Second Approach

• Perform the comparison at the VFS level.
• Replayfs: replaying file system traces at the VFS level.
• Main goal: to reproduce the original file system workload as accurately as possible.

Page 17: Virtualization Technique For Replica Synchronization

Replayfs Components

• Raw traces and replayable traces
• Trace compiler: converts the raw traces into replayable traces
• Commands
    ◦ Sequence of VFS operations with their associated timestamps and process IDs
    ◦ During replay, the actual return value is compared with the return value captured in the original trace.
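The command-replay idea can be sketched as below. This is a user-space toy, not Replayfs itself: `ToyFS` is a hypothetical in-memory stand-in for the file system under test, and the `Command` fields are an assumed layout. Each command carries a timestamp, a process ID, an operation, and the return value recorded in the original trace; replay flags any command whose actual return value differs.

```python
from dataclasses import dataclass


@dataclass
class Command:
    timestamp: float  # seconds since the start of the trace
    pid: int          # process that issued the operation
    op: str           # operation name, e.g. "create", "write", "unlink"
    args: tuple       # operation parameters
    expected: int     # return value captured in the original trace


class ToyFS:
    """In-memory stand-in for the file system under test (not a real VFS)."""

    def __init__(self):
        self.files = {}

    def create(self, name):
        if name in self.files:
            return -17  # -EEXIST
        self.files[name] = b""
        return 0

    def write(self, name, data):
        if name not in self.files:
            return -2  # -ENOENT
        self.files[name] += data
        return len(data)

    def unlink(self, name):
        return 0 if self.files.pop(name, None) is not None else -2


def replay(trace, fs):
    """Run commands in timestamp order; return (command, actual) pairs that differ from the trace."""
    mismatches = []
    for cmd in sorted(trace, key=lambda c: c.timestamp):
        actual = getattr(fs, cmd.op)(*cmd.args)
        if actual != cmd.expected:
            mismatches.append((cmd, actual))
    return mismatches


# Demo: the last command succeeded in the original trace but fails during replay.
trace = [
    Command(0.0, 101, "create", ("a.txt",), 0),
    Command(0.1, 101, "write", ("a.txt", b"hello"), 5),
    Command(0.2, 102, "unlink", ("missing.txt",), 0),
]
fs = ToyFS()
mismatches = replay(trace, fs)
for cmd, actual in mismatches:
    print(f"pid {cmd.pid} {cmd.op}{cmd.args}: expected {cmd.expected}, got {actual}")
```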

Page 18: Virtualization Technique For Replica Synchronization

Replayfs (Contd.)

• Resource Allocation Table (RAT)
    ◦ Used to refer to the command parameters and the return values.
    ◦ Always kept in memory.

Page 19: Virtualization Technique For Replica Synchronization

Replayfs (Contd.)

• Memory buffers
    ◦ Largest component of the Replayfs trace
    ◦ Necessary to replay the trace
    ◦ Include file names and buffers to be written in the future
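One way to picture how commands, the RAT, and the memory buffers fit together is the sketch below. The layout is entirely hypothetical and is not the actual Replayfs trace format: commands hold only small indices, the RAT holds the resources that earlier commands produced (here, an opened file), and the memory buffers hold the bulky data such as file names and the bytes to be written.

```python
def replay_trace(commands, buffers, rat_size):
    """Replay commands that refer to resources via RAT slots and to data via buffer indices."""
    rat = [None] * rat_size  # kept in memory for the whole replay
    files = {}               # toy stand-in for on-disk state
    for op, params in commands:
        if op == "open":
            name = buffers[params["name_buf"]].decode()
            files.setdefault(name, bytearray())
            rat[params["out_slot"]] = name  # result of the open is stored in the RAT
        elif op == "write":
            files[rat[params["slot"]]] += buffers[params["data_buf"]]
        elif op == "close":
            rat[params["slot"]] = None      # the slot can be reused by later commands
    return files


# Memory buffers: file names and the data that later commands will write.
buffers = [b"report.txt", b"hello ", b"world"]

# Commands carry only indices into the RAT and the buffer list.
commands = [
    ("open", {"name_buf": 0, "out_slot": 0}),
    ("write", {"slot": 0, "data_buf": 1}),
    ("write", {"slot": 0, "data_buf": 2}),
    ("close", {"slot": 0}),
]

files = replay_trace(commands, buffers, rat_size=1)
print(files)
```

Keeping the commands small and the table in memory means the replayer only has to touch the large buffers when a command actually needs their contents.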

Page 20: Virtualization Technique For Replica Synchronization

Conclusion

• Detecting the part common to two virtual machines drastically reduces the disk space required.
• TAPER is one algorithm for finding the common part.
• Another approach is to capture traces at the VFS level (Replayfs).

Page 21: Virtualization Technique For Replica Synchronization

References

• TAPER: Tiered Approach for Eliminating Redundancy in Replica Synchronization. University of Texas at Austin.
• Accurate and Efficient Replaying of File System Traces. Stony Brook University.

Page 22: Virtualization Technique For Replica Synchronization

THANK YOU

Any questions?