Data Replication and Power Consumption in Data Grids
Susan V. Vrbsky, Ming Lei, Karl Smith and Jeff Byrd
Department of Computer Science, The University of Alabama
IEEE 2010 Cloud Computing Technology and Science
March 16, 2011
Taikyoung Kim
SNU IDB Lab.
Outline
Introduction
Data Replication
Performance Results
Conclusion and Future Work
Introduction
Data grid features
– Millions of files are generated, and thousands of clients access the files
– An extremely large number of data sets must be managed
Present systems support scalability but are extremely energy inefficient
– Power and cooling of the data center are inefficient
– The power demanded by data centers is predicted to double from 2006 to 2011
Storing, managing and moving massive amounts of data are also a significant bottleneck
Introduction
Our approach
– Save energy through efficient CPU usage
– Consider strategies to minimize disk storage and data transmission
We propose to minimize the amount of data stored by using smart replication strategies
– Replicate the data only when necessary
Goal
– Design data-aware strategies for data-intensive computing
  Shorter running times
  Decreased amount of data transmitted
  Smaller storage space
– Reduce the power needed
Outline
Introduction
Data Replication
– Data Grid Architecture
– Sliding Window Strategy
Performance Results
Conclusion and Future Work
Data Replication
Utilize data replication
– There is a high probability of accessing data that is not at the local site
– Remote file access can be a very expensive operation
  Network bandwidth, network congestion
– Replication reduces the access time and avoids remote file access
Limit the size of the storage
– To decrease the amount of energy needed to store the data
Use smart data replication to reduce the cost of accessing and storing data
Data Replication
Data Grid Architecture
We consider only single-tier grids
– We expect that the strategies developed for single-tier grids can be used within the multi-tier structure
It is common for a job in a data grid to list all the files needed to complete its task
– We use this property in designing a data replication scheme
Data Replication
Sliding Window Strategy
SWIN (Sliding Window replica scheme)
– Considers the file access times in the future and the size of the local site's Storage Element
– Builds a “sliding window”: a set of distinct files that will be used in the immediate future
  Includes all the files the current job will access and the distinct files from the next arriving jobs
  The sum of the sizes of the files in the sliding window is at most the size of the local Storage Element
– Slides forward one more file each time the system finishes processing one file
  The window keeps changing in this way
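The window construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_window`, the file names, the sizes and the Storage Element capacity are all hypothetical, and stopping at the first file that does not fit is an assumption made for simplicity.

```python
from typing import Dict, List


def build_window(upcoming: List[str], sizes: Dict[str, int], se_size: int) -> List[str]:
    """Collect distinct upcoming files, in access order, until the SE is full."""
    window: List[str] = []
    used = 0
    for name in upcoming:
        if name in window:
            continue  # the window holds distinct files only
        if used + sizes[name] > se_size:
            break  # assumption: stop at the first file that no longer fits
        window.append(name)
        used += sizes[name]
    return window


# Hypothetical workload: job J1 reads a, b and job J2 reads b, c, d.
# Every file has size 1 and the local Storage Element holds 3 units.
sizes = {"a": 1, "b": 1, "c": 1, "d": 1}
accesses = ["a", "b", "b", "c", "d"]

first_window = build_window(accesses, sizes, se_size=3)       # ['a', 'b', 'c']
# After file "a" has been processed, rebuild from the remaining accesses.
second_window = build_window(accesses[1:], sizes, se_size=3)  # ['b', 'c', 'd']
```

In this toy run the window moves forward by one file once "a" has been processed, which mirrors the sliding behaviour on the slide.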
Data Replication
Sliding Window Strategy
Q=<J1,J2,…> : a set of jobs
FAS(Ji)=<fi1,fi2,…,fik> : file accessing sequence (fin≠fim)
G_FAS=<FAS(J1),FAS(J2),…,FAS(Jn)> : global file accessing sequence
POS(fx,G_FAS) : returns the first position of fx in G_FAS
Sliding Window rules
1. The sum of the sizes of all the files in the sliding window ≤ Size(SE)
2. No duplicated files exist in the sliding window
3. Any file in the sliding window will not be in a position before POS(fK,G_FAS)
4. Any file not in the sliding window will be in a position after POS(fm,G_FAS)
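As a rough illustration of the notation, the Python sketch below flattens per-job sequences into G_FAS, implements POS, and checks rules 1 and 2. The job contents, file sizes and function names are hypothetical, and positions are 0-based here as an assumption. Rules 3 and 4 are left out because the transcript does not define fK and fm.

```python
from typing import Dict, List, Sequence


def pos(fx: str, g_fas: Sequence[str]) -> int:
    """Return the first (0-based) position of fx in the flattened G_FAS.

    Raises ValueError if fx never appears.
    """
    return list(g_fas).index(fx)


def satisfies_rules_1_and_2(window: List[str], sizes: Dict[str, int], se_size: int) -> bool:
    fits = sum(sizes[f] for f in window) <= se_size  # rule 1: total size <= Size(SE)
    distinct = len(set(window)) == len(window)       # rule 2: no duplicated files
    return fits and distinct


# Hypothetical jobs; within one job the files are distinct, as FAS requires.
fas = {"J1": ["a", "b"], "J2": ["b", "c", "d"]}
g_fas = [f for job in ("J1", "J2") for f in fas[job]]  # a, b, b, c, d
sizes = {"a": 1, "b": 1, "c": 1, "d": 1}

first_b = pos("b", g_fas)                                     # 1
ok = satisfies_rules_1_and_2(["a", "b", "c"], sizes, 3)       # True
too_big = satisfies_rules_1_and_2(["a", "b", "c", "d"], sizes, 3)  # False
```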
Outline
Introduction
Data Replication
Performance Results
– Performance Environment
– Number of Nodes Powered On
– File Availability
Conclusion and Future Work
Performance Results
Evaluate the performance of the SWIN replica strategy using Sage, built at the University of Alabama
Sage nodes
– Intel D201GLY2 mainboard with 1.2 GHz Celeron CPU
  On-board 10/100 Megabit LAN
– 1 GB 533 MHz RAM
– 80 GB SATA 3 hard drive
Energy usage rates
– Booting and peak: 430 Watts
– Idle: 335 Watts (cooling fans turned on), 315 Watts (cooling fans turned off)
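A little arithmetic on the stated rates shows how much of the draw is present even when nothing is being computed. The sketch below uses only the three figures from the slide; the variable names are illustrative, and the transcript does not say whether the figures are per node or for the whole system.

```python
PEAK_W = 430           # booting and peak, from the slide
IDLE_FANS_ON_W = 335   # idle, cooling fans turned on
IDLE_FANS_OFF_W = 315  # idle, cooling fans turned off

fan_draw_w = IDLE_FANS_ON_W - IDLE_FANS_OFF_W  # 20 W attributable to the fans
headroom_w = PEAK_W - IDLE_FANS_ON_W           # 95 W between idle and peak
idle_share_of_peak = IDLE_FANS_ON_W / PEAK_W   # about 0.78

print(f"fans: {fan_draw_w} W, idle-to-peak headroom: {headroom_w} W, "
      f"idle share of peak: {idle_share_of_peak:.0%}")
```

On these numbers, idle draw with the fans on is roughly 78% of peak, so only about 95 W separates an idle system from a fully busy one.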
Performance Results
Performance Environment
The client nodes are responsible for
– Processing the request
– Maintaining replica copies
– Notifying the server when a job is completed
Default experiment parameters
Metric
– Total running time
– Average number of watts required to process a job
  Sampled every 1 minute
(400MB)
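The average-watts metric can be computed from periodic power readings along the following lines. This is a sketch under assumptions: the readings are invented for illustration, the function names are hypothetical, and each sample is treated as constant over its one-minute interval when estimating energy.

```python
from typing import List


def average_watts(samples: List[float]) -> float:
    """Mean of the power readings taken while the jobs were running."""
    return sum(samples) / len(samples)


def energy_wh(samples: List[float], interval_minutes: float = 1.0) -> float:
    """Approximate energy in watt-hours, one reading per interval."""
    return sum(samples) * interval_minutes / 60.0


# Hypothetical readings, one per minute, over a four-minute run.
readings = [335.0, 400.0, 430.0, 335.0]

avg = average_watts(readings)  # 375.0
wh = energy_wh(readings)       # 25.0
```

Reporting both values is useful because a run can draw more watts on average yet consume less energy in total if it finishes sooner.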
Performance Results
Number of Nodes Powered On
The power consumed is affected by whether or not all of the nodes are powered on
– Regardless of whether they are being used in the computation of the jobs
LFU: Least Frequently Used
LRU: Least Recently Used
MRU: Most Recently Used
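The three baseline policies differ only in which cached file they evict when space is needed. The Python sketch below makes that difference concrete; the function name `choose_victim` and the cache state are hypothetical, and ties are broken by dictionary order, which is an arbitrary choice.

```python
from typing import Dict


def choose_victim(policy: str, last_used: Dict[str, int], use_count: Dict[str, int]) -> str:
    """Pick the file to evict from a full cache under the given policy."""
    if policy == "LRU":
        return min(last_used, key=last_used.get)  # oldest last access
    if policy == "MRU":
        return max(last_used, key=last_used.get)  # newest last access
    if policy == "LFU":
        return min(use_count, key=use_count.get)  # fewest accesses
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical cache state after the accesses a, a, b, c, b at times 0..4.
last_used = {"a": 1, "b": 4, "c": 3}
use_count = {"a": 2, "b": 2, "c": 1}

lru_victim = choose_victim("LRU", last_used, use_count)  # 'a'
mru_victim = choose_victim("MRU", last_used, use_count)  # 'b'
lfu_victim = choose_victim("LFU", last_used, use_count)  # 'c'
```

All three policies decide from past accesses only, whereas SWIN relies on the list of files that queued jobs will need.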
Performance Results
Number of Client Nodes
Measured the total running time for 100 jobs with all nodes powered on
Performance Results
Number of Client Nodes
While LRU requires the most watts, it has a shorter running time overall than LFU and MRU
– Does not require the highest number of watts
The jobs with only 1 or 2 client nodes take longer to run than those utilizing 8 client nodes
The watts required for computation are a smaller percentage of the total watts
Performance Results
File Availability
The files are only available at the server
– (a) The jobs are able to run in a shorter amount of time as the number of clients increases
– (b) The bottleneck increases as the number of client nodes increases
  Assume all file requests must go through the resource broker at the server
The amount of power consumed is not always strictly related to the running time of the jobs
Lastly, we have shown that the window size can be decreased without increasing the running time or power consumed
Outline
Introduction
Data Replication
Performance Results
Conclusion and Future Work
Conclusion and Future Work
We propose smart strategies for replicating files
– One way to minimize the energy consumed in a data grid
SWIN strategy
– Minimizes the amount of data transmitted and storage needed
– Performs better than existing strategies, such as LRU, MRU and LFU
– Particularly beneficial for power saving when resource contention is high
– Decreases running time and watts required
Smaller storage can be used to lower the amount of power
Future work
– Study the performance of SWIN when the files are of different sizes
– Explore more efficient implementations for transferring files
– Design and test additional replica schemes by utilizing the CPU
– Consider ways to schedule the jobs
Thank you
Questions?