Storage Availability in Large-Scale Data Centers

INTELLIGENT DATA OUTSOURCING: Improved Storage Availability in Large-Scale Data Centers

Upload: marybabu10

Posted on 19-Aug-2015



OUTLINE

● Introduction

● Existing System

● Proposed System

● Analysis

● Conclusion

● References

INTRODUCTION

Big data

● Broad term for large datasets

● Business technology for modern enterprises

● Accuracy leads to reduced risk

● Storage is a critical component

● Data is stored on disks

STORAGE SYSTEM TASKS

● High priority foreground tasks

● Low priority background tasks

[Diagram: storage system tasks divide into foreground tasks and background tasks]

LOW PRIORITY BACKGROUND TASKS

● RAID Reconstruction

● RAID Resynchronisation

● Disk Scrubbing

INEFFICIENCIES OF EXISTING SYSTEM

● Time consuming

● Data loss

● Inefficient storage availability

● Failure-induced optimization

● Does not exploit the predictable nature of background tasks

● Passive

INTELLIGENT DATA OUTSOURCING

● Dynamically captures data popularity

● Exploits temporal and spatial access locality

● Balances the background task workload against user I/O requests

● Portable

OPTIMIZATION SCHEME

EXISTING SYSTEM

● Reactive Optimization

● Request Based

● Exploits Temporal Locality

PROPOSED SYSTEM

● Proactive Optimization

● Zone Based

● Exploits both temporal and spatial locality

OPTIMIZATION

Reactive Optimization

● Starts after the crash

● Passive

Proactive Optimization

● Starts before the crash

● Active

ACCESS LOCALITY

Temporal locality

● Repeated access to the same data within a short time

● Request Based Optimization

Spatial locality

● Clustered data access within small storage areas

● Zone Based Optimization

[Diagram: access locality divides into temporal locality and spatial locality]
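As a rough illustration of the two kinds of locality, the sketch below counts accesses per block (temporal reuse) and per fixed-size zone (spatial clustering). The zone size, the sample trace, and the name `locality_counts` are assumptions made for this example and do not come from the slides.

```python
from collections import Counter

ZONE_SIZE = 1024  # blocks per zone (assumed value, not from the slides)


def locality_counts(offsets, zone_size=ZONE_SIZE):
    """Return (per-block access counts, per-zone access counts)."""
    # Temporal locality shows up as the same block being accessed repeatedly.
    per_block = Counter(offsets)
    # Spatial locality shows up as many accesses landing in the same zone.
    per_zone = Counter(offset // zone_size for offset in offsets)
    return per_block, per_zone


# Hypothetical trace of block offsets.
trace = [10, 10, 10, 12, 15, 5000, 11, 10]
per_block, per_zone = locality_counts(trace)
```

A request-based scheme would act on `per_block`, while a zone-based scheme would act on `per_zone`, which also captures neighbouring blocks that were each touched only once.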

CHARACTERISTICS OF IDO

● Proactive Zone Based Optimization

● Temporal Locality

● Spatial Locality

● User I/O

● Background I/O

DESIGN OF IDO

THREE MAIN OBJECTIVES

● Improving storage availability

● Improving I/O performance

● Providing high portability

IDO ARCHITECTURE

IDO FUNCTIONAL MODULES

[Diagram of modules: Hot Zone Identifier, Data Migrator, Request Distributor, Task Predictor, Data Reclaimer]

● Hot Zone Identification

● Task Prediction

● Request Distribution

● Data Migration

● Data Reclamation

KEY DATA STRUCTURES

ZONE_TABLE

● Num

● Popularity

● Flag

D_MAP

● D_offset

● S_offset

● Len
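A minimal sketch of how these two structures might look in code. The field names follow the slide; the field meanings in the comments, the class names `ZoneEntry` and `DMapEntry`, and the initial flag value 00 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ZoneEntry:
    """One ZONE_TABLE row."""
    num: int             # zone number
    popularity: int = 0  # access count used to rank zones
    flag: int = 0b00     # zone state; 00 as the initial value is assumed


@dataclass
class DMapEntry:
    """One D_MAP row describing a redirected extent."""
    d_offset: int  # assumed: offset in the original (degraded) RAID set
    s_offset: int  # assumed: offset in the surrogate RAID set
    length: int    # Len: extent length


# Hypothetical contents: four zones and one redirected extent.
zone_table = {n: ZoneEntry(num=n) for n in range(4)}
d_map = [DMapEntry(d_offset=2048, s_offset=0, length=16)]
```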

HOT DATA IDENTIFICATION

THREE DESIGN ISSUES

● Exploiting the spatial locality of workloads

● Exploiting the temporal locality of requests

● Implementing intelligent module data structures
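One way these ideas could fit together is sketched below: requests are grouped into zones (spatial locality) and each zone's popularity is aged so that recent accesses weigh more (temporal locality). The class name, the halving decay, and the zone size are assumptions for this example, not the method described in the slides.

```python
from collections import defaultdict


class HotZoneIdentifier:
    """Hypothetical popularity tracker; all parameters are assumed values."""

    def __init__(self, zone_size=1024, decay=0.5):
        self.zone_size = zone_size
        self.decay = decay
        self.popularity = defaultdict(float)  # zone number -> score

    def record(self, offset, length=1):
        # Credit every zone the request touches (spatial locality).
        first = offset // self.zone_size
        last = (offset + length - 1) // self.zone_size
        for zone in range(first, last + 1):
            self.popularity[zone] += 1

    def age(self):
        # Shrink old scores so recent accesses dominate (temporal locality).
        for zone in self.popularity:
            self.popularity[zone] *= self.decay

    def hot_zones(self, k):
        # Most popular first; ties are broken by the lower zone number.
        ranked = sorted(self.popularity, key=lambda z: (-self.popularity[z], z))
        return ranked[:k]


identifier = HotZoneIdentifier(zone_size=100)
for block in [5, 20, 150, 30, 990]:
    identifier.record(block)
top_two = identifier.hot_zones(2)
```

Calling `age()` periodically keeps a zone that was busy long ago from staying hot forever; how often to call it is a tuning choice this sketch leaves open.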

PROACTIVE DATA MIGRATION

● Hot zone identified

● Task Predictor detects task

● Data Migrated

● Flag set to 01

● RAID Reconstructed

● Flag set to 10

● RAID Reclaimed

● Corresponding D_map deleted
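The sequence above can be read as a small state machine over a zone's flag. The sketch below follows the flag values 01 and 10 from the slide; the initial value 00, the reset to 00 after reclamation, and all function and class names are assumptions made for illustration.

```python
NOT_MIGRATED = 0b00   # assumed initial state
MIGRATED = 0b01       # set after the hot zone is migrated
RECONSTRUCTED = 0b10  # set after RAID reconstruction completes


class Zone:
    def __init__(self, num):
        self.num = num
        self.flag = NOT_MIGRATED


def migrate(zone, d_map):
    """Move a hot zone to the surrogate set and set its flag to 01."""
    if zone.flag != NOT_MIGRATED:
        raise ValueError("zone already migrated")
    d_map[zone.num] = []  # will hold redirected-write records for this zone
    zone.flag = MIGRATED


def mark_reconstructed(zone):
    """Record that reconstruction finished and set the flag to 10."""
    if zone.flag != MIGRATED:
        raise ValueError("zone was not migrated")
    zone.flag = RECONSTRUCTED


def reclaim(zone, d_map):
    """Reclaim the zone's data and delete its D_MAP records."""
    if zone.flag != RECONSTRUCTED:
        raise ValueError("zone not reconstructed yet")
    d_map.pop(zone.num, None)
    zone.flag = NOT_MIGRATED  # assumption: the zone returns to its initial state


zone, d_map = Zone(7), {}
migrate(zone, d_map)
mark_reconstructed(zone)
reclaim(zone, d_map)
```

Each step checks the previous flag value, so calling the steps out of order raises an error instead of silently corrupting the bookkeeping.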

IMPROVED STORAGE AVAILABILITY FOR I/O

I/O read request

● IDO determines target data zone

● Read request issued to degraded/surrogate RAID set

● Popularity updated

● IDO checks D_map

I/O write request

● Checks D_map for write request hits

● D_map updated

● Sequentially written to surrogate RAID set
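A simplified sketch of how a request distributor might route reads and writes along these lines. It matches D_MAP entries by exact offset only, models the surrogate set as an append-only log, and uses hypothetical names throughout; none of this is taken from the slides beyond the overall flow.

```python
class RequestDistributor:
    """Hypothetical router between the degraded and surrogate RAID sets."""

    def __init__(self, zone_size=1024):
        self.zone_size = zone_size
        self.migrated = set()  # zones whose data lives on the surrogate set
        self.popularity = {}   # zone number -> access count
        self.d_map = {}        # d_offset -> (s_offset, length)
        self.log_head = 0      # next free offset in the surrogate write log

    def read(self, offset):
        """Return which RAID set should serve the read."""
        zone = offset // self.zone_size
        self.popularity[zone] = self.popularity.get(zone, 0) + 1
        # Redirected writes and migrated zones are served by the surrogate.
        if offset in self.d_map or zone in self.migrated:
            return "surrogate"
        return "degraded"

    def write(self, offset, length):
        """Redirect the write to the surrogate log and return its offset."""
        s_offset = self.log_head
        self.log_head += length  # sequential append
        # A hit replaces the old mapping; a miss creates a new one.
        self.d_map[offset] = (s_offset, length)
        return s_offset


distributor = RequestDistributor(zone_size=100)
before = distributor.read(5)
first = distributor.write(5, 8)
second = distributor.write(300, 4)
after = distributor.read(5)
```

Appending writes sequentially keeps the surrogate set's I/O pattern simple; a fuller version would also handle partially overlapping extents, which this sketch ignores.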

DATA CONSISTENCY

TWO ASPECTS CONSIDERED

● Key data structures

● Redirected write data on surrogate RAID

ANALYSIS

Overhead Analysis

● Performance Overhead

● Memory Overhead
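The slides give no figures for the memory overhead, so the following is only a back-of-the-envelope sketch of how it could be estimated from the two key data structures. The 12-byte entry size (three 4-byte fields), the capacity, and the zone size are all assumed values.

```python
def zone_table_bytes(capacity_bytes, zone_bytes, entry_bytes=12):
    """Estimate ZONE_TABLE size: one entry per zone (entry size assumed)."""
    zones = -(-capacity_bytes // zone_bytes)  # ceiling division
    return zones * entry_bytes


def d_map_bytes(entries, entry_bytes=12):
    """Estimate D_MAP size: one entry per redirected extent (size assumed)."""
    return entries * entry_bytes


# Hypothetical configuration: 1 TiB of capacity split into 1 MiB zones.
table_size = zone_table_bytes(2**40, 2**20)
map_size = d_map_bytes(100_000)
```

Under these assumed values the zone table has 2^20 entries; larger zones shrink the table but make popularity tracking coarser.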

CONCLUSION

● Proactive optimisation accelerates low-priority background tasks

● The zone-based approach boosts the performance of low-priority background tasks

● IDO was designed and implemented as a proactive, zone-based optimisation that outsources data

REFERENCES

● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Proactive Data Migration for Improved Storage Availability in Large-Scale Data Centers. IEEE Transactions on Computers, 2015.

● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Improving Availability of RAID-Structured Storage Systems by Workload Outsourcing. IEEE Transactions on Computers, 2011.

● S. Wu, B. Mao, D. Feng, and J. Chen. Availability-Aware Cache Management with Improved RAID Reconstruction Performance. In CSE’10, Dec. 2010.

● L. Xiang, Y. Xu, John C. S. Lui, and Q. Chang. Optimal Recovery of Single Disk Failure in RDP Code Storage Systems. In SIGMETRICS’10, Jun. 2010.
