SecureMR: A Service Integrity Assurance Framework for MapReduce

Author: Wei Wei, Juan Du, Ting Yu, Xiaohui Gu

Source: Annual Computer Security Applications Conference, 2009, pp.73-82.

Presenter: Tsuei-Hung Sun (孫翠鴻)

Date: 2010/9/17


Outline

• Introduction

• Motivation

• Contribution

• Scheme

• Security analysis

• Performance evaluation

• Comment


Introduction

• MapReduce

– A parallel data processing model that simplifies parallel data processing on large clusters.

– Proposed by Google.

– It mainly runs on clusters belonging to a single administrative domain.

– Yahoo’s Hadoop

– Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3).


Introduction

Fig. The MapReduce data processing reference model.

[Figure labels: M1, M2, M3; R1, R2, R3; Distributed File System]
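The model in the figure can be illustrated with a minimal single-process sketch in Python. Word counting is used only as a familiar example; the function names (`map_phase`, `partition`, `reduce_phase`, `run_job`) and the partitioning rule are assumptions made for illustration and do not come from the slides or from any MapReduce implementation.

```python
from collections import defaultdict


def map_phase(block):
    # Map: emit an intermediate (word, 1) pair for every word in one input block.
    return [(word, 1) for word in block.split()]


def partition(pairs, num_reducers):
    # Split a mapper's intermediate pairs into one partition per reducer.
    # A simple deterministic rule is used here (an assumption for illustration).
    parts = defaultdict(list)
    for key, value in pairs:
        index = sum(key.encode("utf-8")) % num_reducers
        parts[index].append((key, value))
    return parts


def reduce_phase(pairs):
    # Reduce: sum the values of each key in one reducer's input.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(blocks, num_reducers=2):
    # Run every map task, route each partition to its reducer, then reduce.
    reducer_inputs = defaultdict(list)
    for block in blocks:
        for index, pairs in partition(map_phase(block), num_reducers).items():
            reducer_inputs[index].extend(pairs)
    result = {}
    for pairs in reducer_inputs.values():
        result.update(reduce_phase(pairs))
    return result
```

In a real deployment the input blocks and the final results live in the distributed file system, and the map and reduce tasks run on different workers.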


Introduction

Fig. Combine multiple map and reduce phases.


Introduction

• Data processing service integrity

– Replication-based techniques

– Sampling techniques

– Checkpoint-based verification


Motivation

• Existing work addresses service integrity, but not for data processing services.

• Drawbacks of replication-based techniques

– Replicating all distributed computing tasks for consistency verification is inefficient.

– Centralized consistency verification over massive result data is not scalable.


Contribution

• Decentralized replication-based integrity verification for MapReduce in open systems.

• Achieves security: non-repudiation, resilience to DoS attacks and replay attacks.

• Security components can be easily integrated into existing MapReduce implementations.

• Low performance overhead.

• The first attempt to address the integrity of data processing services.


Scheme

• SecureMR - Architecture Design

11

Scheme

• SecureMR - Communication Design

– Commitment protocol

– Verification protocol


Scheme

• Commitment Protocol

IDMap: a monotonically increasing identity of a map task.

DataLoc: input data block location.

sig: Master’s signature.

KpubM: Mapper’s public key.

sigM: Mapper’s signature.

HP1,…,HPr: hash value for each partition of the mapper’s intermediate result.

[Figure labels: Scheduler; Task Executor; Commit Manager]
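The commit step can be sketched as follows: the mapper hashes each partition of its intermediate result (HP1,…,HPr) and sends the hashes under its signature. The slides do not give the exact message layout, so the field layout below is an assumption. The standard library has no public-key signatures, so an HMAC tag stands in for sigM here; a real deployment would sign with the mapper's private key and verify with KpubM.

```python
import hashlib
import hmac


def hash_partitions(partitions):
    # HP1..HPr: one SHA-256 digest per partition of the intermediate result.
    return [hashlib.sha256(p).hexdigest() for p in partitions]


def _payload(id_map, hashes):
    # Hypothetical serialization of the fields covered by the mapper's tag.
    return "|".join([str(id_map)] + list(hashes)).encode("utf-8")


def make_commitment(id_map, partitions, mapper_key):
    # Build the commitment for one map task.
    hashes = hash_partitions(partitions)
    # Stand-in for sigM: an HMAC tag, NOT a real public-key signature.
    sig_m = hmac.new(mapper_key, _payload(id_map, hashes), hashlib.sha256).hexdigest()
    return {"IDMap": id_map, "HP": hashes, "sigM": sig_m}


def check_commitment(commitment, mapper_key):
    # Recompute the tag over the committed fields and compare in constant time.
    expected = hmac.new(
        mapper_key, _payload(commitment["IDMap"], commitment["HP"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, commitment["sigM"])
```

Because the tag covers every partition hash, changing any committed hash afterwards makes `check_commitment` fail.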


Scheme

• Verification Protocol

Pi: partition of intermediate results that the reducer will process.

ADM: Mapper’s address.

HPi: hash of partition Pi committed by the Committer.

ReqSeq: sequence number.

[Figure labels: Task Executor; Manager; Scheduler; Verifier; Committer; sigR]
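The core check on the reducer side can be sketched as recomputing the hash of the received partition Pi and comparing it with the committed HPi. The second part shows one common way a sequence number can be used to reject replayed requests; the slides only say that ReqSeq is a sequence number, so tying it to replay protection in this way, along with the names `verify_partition` and `RequestCounter`, is an assumption.

```python
import hashlib
import hmac


def verify_partition(data, committed_hash):
    # Recompute the hash of the received partition Pi and compare it with HPi.
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, committed_hash)


class RequestCounter:
    """Accept a ReqSeq only if it is strictly larger than the last accepted one."""

    def __init__(self):
        self.last_seen = -1

    def accept(self, req_seq):
        # A repeated or older sequence number is treated as a replay.
        if req_seq <= self.last_seen:
            return False
        self.last_seen = req_seq
        return True
```

A mismatch in `verify_partition` means the data the reducer received is not the data the mapper committed to.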


Scheme

• Extension for Reducers and MapReduce Chain

[Figure labels: Map Phase; Reduce Phase; Verify Phase; "Add Verifier component"; "Add Committer component"]


Security analysis

• Collusive Attack - Attacker behavior analysis

– Periodical Attacker

• Naive attacker

• Attacker without collusion

• Attacker with collusion

– Strategic Attacker


Security analysis

Fig. Detection Rate for Non-Collusion Naive Attacker. (b = 20; Pm = 1)

Fig. Detection Rate for Non-Collusion Periodical Attacker. (b = 20; Pm = 0.5)

b: number of blocks in one input job. Pm: misbehaving probability. l: number of jobs the mapper has done when its misbehavior is detected.
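For rough intuition about how detection depends on Pm, here is a simplified model that is not the paper's analysis. It assumes a non-colluding mapper that misbehaves independently on each task with probability Pm, that each task is independently duplicated on an honest worker with probability `p_dup`, and that a misbehaving result is caught whenever its task was duplicated. The names `p_dup` and `tasks` are hypothetical.

```python
def detection_probability(p_m, p_dup, tasks):
    # Per task, the mapper is caught only if it misbehaves AND the task was duplicated.
    per_task = p_m * p_dup
    # Probability of being caught at least once over the given number of tasks.
    return 1.0 - (1.0 - per_task) ** tasks
```

Under these assumptions the detection probability rises with the number of tasks the mapper handles, and it stays at zero if nothing is duplicated.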


Security analysis

Fig. Detection Rate for Collusion Periodical Attacker. (n = 50; Pm = 0.5; b = 20; l = 15)

Fig. Misbehaving Probability vs. Duplication Rate. (n = 50; b = 20; l = 15)

n: total number of workers. m: number of malicious workers.


Performance evaluation

T: time. D: data transmission cost. r: number of reducers.


Performance evaluation

Fig. Response Time vs. Number of Reduce Tasks. (number of map tasks = 60; data size = 1 GB)

Fig. Response Time vs. Data Size. (number of map tasks = 60; number of reduce tasks = 25)


Performance evaluation

Fig. Response time vs. Duplication Rate.

Fig. Response time vs. Number of Reduce Tasks.

number of map tasks = 60; data size = 1 GB


Comment

• Assign and Notify can be combined into one step.

• TicketM contains some parameters that are the same as the part the reducer signs in the request message.

• If the first request fails, what can the reducer do? (How are TicketM and ReqSeq renewed?)

• In the Response message, the mapper could sign the data together with the rest of the message, which would avoid one hash, and the reducer would not need to check it either.
