distributed services scheduling and cloud provisioning

Distributed Services Scheduling and Cloud Provisioning Project Mentor - Mr. Shashank Srivastava Department of Computer Science and Engineering Motilal Nehru National Institute of Technology Allahabad, Allahabad

Upload: ar-agarwal

Post on 23-Jan-2015


DESCRIPTION

This is the presentation for my final year project at NIT Allahabad (2013-14). The purpose of the project is to design a scheduling algorithm for a cloud environment with proper resource management.

TRANSCRIPT

Page 1: Distributed Services Scheduling and Cloud Provisioning

Distributed Services Scheduling and Cloud Provisioning

Project Mentor - Mr. Shashank Srivastava

Department of Computer Science and Engineering Motilal Nehru National Institute of Technology Allahabad, Allahabad

Page 2: Distributed Services Scheduling and Cloud Provisioning

Project Members

Name Registration Number

Prashant Mishra 20105094

Nishant Narang 20104085

Sonu Goel 20103030

Trishala Saini 20104119

Arjit Agarwal 20094044

Page 3: Distributed Services Scheduling and Cloud Provisioning

Introduction to Cloud Computing

In computer science, cloud computing is used as a synonym for distributed computing over a network: the ability to run a program or application on many connected computers at the same time. The term also covers network-based services that appear to be provided by real server hardware but are in fact served by virtual hardware, simulated by software running on one or more real machines.

Page 4: Distributed Services Scheduling and Cloud Provisioning

[Figure: A Simplified Cloud Computing Architecture. Labelled components: Clients, Computer Network, Cloud, Control Node, Application Servers, Database (Storage).]

Page 5: Distributed Services Scheduling and Cloud Provisioning

Cloud Service Models: SaaS, PaaS, IaaS

Infrastructure as a Service (IaaS)

IaaS is the delivery of technology infrastructure as an on-demand, scalable service. It offers resources such as a virtual machine disk image library, block or file-based storage, firewalls, load balancers, virtual LANs and software bundles. Cloud providers typically bill IaaS customers for the amount of resources allocated and consumed.

Page 6: Distributed Services Scheduling and Cloud Provisioning

Cloud Service Models: SaaS, PaaS, IaaS

Platform as a Service (PaaS)

PaaS providers make a computational platform available to the client, typically including an OS, a programming language, an execution environment, etc. Clients pay only a nominal fee for using the cloud services, which saves them the cost of purchasing the underlying hardware.

Page 7: Distributed Services Scheduling and Cloud Provisioning

Cloud Service Models: SaaS, PaaS, IaaS

Software as a Service (SaaS)

SaaS lets clients use software installed on the cloud via an access client or a browser (web service). Remote Desktop Virtualization is a common example of SaaS, e.g. VNCViewer. It is usually priced on a pay-per-use basis.

Page 8: Distributed Services Scheduling and Cloud Provisioning

Motivation for the Project

- To let users access cloud services from “anywhere, anytime”, deployment of cloud applications has to be made easy.
- The average utilisation of CPU and RAM on a typical user's system is below 10%. Such end users can share their spare resources with the cloud, benefiting themselves and others in return.
- Applications should be scheduled with efficient use of resources and with regard to preferences.
- Parallel handling of requests will lead to faster scheduling of incoming requests and better utilisation of available resources.
- Parallel computing and virtualization are cost effective and may lead to optimum utilisation of available resources.

Page 9: Distributed Services Scheduling and Cloud Provisioning

Proposed Framework

The proposed framework architecture has the following components:

- User: the customer who uses cloud services for deploying and executing apps.
- Controlling Master Node: receives requests from the clients, and is where Virtual Machines (VMs) register to share their resources. The scheduling algorithm runs on this node.
- Server Database: stores the requests received from end users, their details, and details of the applications or commands to be executed. It also maintains the list of VMs that are registered and online to serve client requests.
- Scheduler: integrated into the master node; it runs the “dynamic priority-based weighted queue scheduling algorithm”.
- Working VM Nodes: connected to the master node; they execute client requests as assigned by the scheduler and provide their remote desktops to clients.

Page 10: Distributed Services Scheduling and Cloud Provisioning

[Figure: Architecture of Framework Proposed by Us. Labelled components: User Machines, Controlling Master Node, Scheduler, Server Database, Working VM Nodes. Edge labels: client requests; sharing of resources between VMs; TCP connection initiated by VM with client for Remote Desktop Virtualization.]

Page 11: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

The proposed framework is implemented by three applications, which are object-oriented and completely modular: the client framework, the server framework and the VM framework.

Client Framework

This application is meant to be installed on client machines. It lets users connect to the cloud using the known IP address and port on which the cloud applications are running. Users can request a remote desktop of a VM running on the cloud, subject to a few requirements:

- Minimum RAM required
- Minimum hard disk space required
- Operating system (Windows XP/Vista/7/8, Mac OS X, Ubuntu, other Linux distributions, etc.)
- Duration
- Priority (higher priority has higher charges per unit time)
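The request constraints listed for the client framework can be pictured as a small record. The following is a minimal sketch in Python, not the project's code: the class name `VMRequest`, the field names, the units (MB, GB, minutes) and the three priority labels are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical priority labels; the slides only say higher priority costs more.
PRIORITIES = ("high", "normal", "low")


@dataclass(frozen=True)
class VMRequest:
    """One client request for a remote desktop (field names are assumptions)."""
    min_ram_mb: int
    min_disk_gb: int
    operating_system: str
    duration_min: int
    priority: str

    def validate(self) -> None:
        # Reject requests that could never be matched to a VM.
        if self.min_ram_mb <= 0 or self.min_disk_gb <= 0:
            raise ValueError("RAM and disk requirements must be positive")
        if self.duration_min <= 0:
            raise ValueError("duration must be positive")
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority}")


def is_eligible(request: VMRequest, vm_ram_mb: int, vm_disk_gb: int, vm_os: str) -> bool:
    """True when a VM meets every minimum stated in the request."""
    return (
        vm_ram_mb >= request.min_ram_mb
        and vm_disk_gb >= request.min_disk_gb
        and vm_os == request.operating_system
    )
```

Keeping the record immutable means the same request object can be passed between the handler, scheduler and dispatcher threads without extra locking.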

Page 12: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

The Client Framework has two main packages:

- Client FrontEnd: provides the GUI through which the client enters the various specifications and connects to the server.
- Client Remote Interface: handles the remote desktop that is tunneled to it by the VM. It also records all Mouse Click and Key Typed events and sends them to the remote VM.

[Screenshot: Client FrontEnd]

Page 13: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

Server Framework

This application needs to run on the Controlling Master Node. The main tasks of this framework are:

- Listens for client and VM requests.
- Invokes a client handler thread and a VM handler thread.
- On receiving a client request, invokes a scheduler client enqueue thread.
- On receiving a VM register request, makes a new entry in the database by invoking the Database Handler object.
- Starts a NotificationReceiver thread to periodically update the current load (in terms of CPU and RAM usage) on the registered VMs.
- The Scheduler Dispatcher thread dequeues the appropriate request, selects the corresponding VM and dispatches the job to that VM.
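The load table kept up to date by the NotificationReceiver thread could look roughly like the sketch below. This is only an illustration under stated assumptions: the class `LoadRegistry`, its method names and the staleness timeout are hypothetical and do not come from the slides; only the idea of periodically recording CPU and RAM usage per registered VM does.

```python
import threading
import time


class LoadRegistry:
    """Latest load report per registered VM, safe to share between threads."""

    def __init__(self, stale_after_s=30.0):
        self._lock = threading.Lock()
        self._reports = {}  # vm_id -> (cpu_percent, free_ram_mb, last_seen)
        self._stale_after_s = stale_after_s  # hypothetical timeout, not from the slides

    def update(self, vm_id, cpu_percent, free_ram_mb, now=None):
        """Record a notification from a VM; `now` can be injected for testing."""
        seen = time.monotonic() if now is None else now
        with self._lock:
            self._reports[vm_id] = (cpu_percent, free_ram_mb, seen)

    def get(self, vm_id):
        """Return (cpu_percent, free_ram_mb) for a VM, or None if unknown."""
        with self._lock:
            report = self._reports.get(vm_id)
        return None if report is None else report[:2]

    def online(self, now=None):
        """Ids of VMs whose last report is recent enough to count as online."""
        current = time.monotonic() if now is None else now
        with self._lock:
            return sorted(
                vm_id
                for vm_id, (_, _, seen) in self._reports.items()
                if current - seen <= self._stale_after_s
            )
```

A single lock is enough here because each operation touches the table only briefly; the handler threads and the dispatcher would share one registry instance.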

Page 14: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

The Server Framework has four main packages:

- Server FrontEnd: provides the GUI to start the master node and to view registered VMs and log details.
- Server Request Handler: handles client requests and starts the client request thread, which extracts the requirements specified by the user.
- Server DB Handler: handles the VM and client request database.
- Server Scheduler: selects an appropriate VM and dispatches the client request to it.

[Screenshot: Server FrontEnd]

Page 15: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

VM Framework

This application is to be installed on the VMs running on the worker nodes, spawned by hypervisors such as VMWare or Oracle VirtualBox. With the help of this framework, VMs can share their resources on the cloud. The main tasks of this framework are:

- Fetches system information and sends it to the controlling node at periodic intervals via its Notifier thread.
- Listens for the dispatcher's instructions for servicing a client request.
- Invokes the Remote Desktop Sender thread when a request is dispatched to it.
- Sends an acknowledgement to the client node.
- Sends the machine snapshots to the client.
- Receives the Mouse Click and Key Typed events from the client end and performs the related operations on the VM whose remote desktop is assigned to that client.

Page 16: Distributed Services Scheduling and Cloud Provisioning

Implementation of the Framework

The VM Framework has two main packages:

- VM FrontEnd: provides the GUI to monitor the VM, to enter the various specifications and to connect to the server so that load-related data in the server database is updated periodically. On receiving a client request, it tries to send an acknowledgement to the client.
- VM Remote Desktop Sender Interface: handles the remote desktop that is tunneled to the client. It also receives all Mouse Click and Key Typed events from the client and performs the corresponding actions.

[Screenshot: VM FrontEnd]

Page 17: Distributed Services Scheduling and Cloud Provisioning

Sequence Flow Diagram

[Figure: sequence flow between the Server Node, the Client Node and the VM Node. Step labels recovered from the diagram, in the order they were extracted:]

- Start Client Thread and VM Thread
- Register Information
- Listen on Port 6060
- Start Notifier Thread
- Request VM
- Enqueue Request
- Start Scheduler Thread
- Find appropriate VM
- Dequeue Client Request
- Send Random Port (R)
- Open Random Port and wait for connection from VM
- Client IP and random port R
- Start thread to receive status
- Acknowledge request on port R
- Start Remote Receiver Interface
- Send Ready Message
- Start Remote Sender Interface
- Send Snapshots Periodically
- Fetch Mouse Click/Key Press Events
- Send Command (Click Events)
- Execute Received Command and Send Snapshot
- Terminate Connection
- Request Serviced

Page 18: Distributed Services Scheduling and Cloud Provisioning

Proposed Scheduling Algorithm

The framework was initially built around the traditional FCFS job scheduling algorithm. But because cloud infrastructure is “On Demand, Pay Per Use”, not all jobs can be rated alike, so FCFS failed to serve the purpose. This led us to a priority-based job scheduling algorithm, where the priority can be decided on the following basis:

- Jobs with a higher cost per unit time are assigned higher priority than jobs with a lower cost per unit time.
- Deadline-constrained jobs are given higher priority than jobs whose time limits are not constrained.

A simple priority-based job scheduling algorithm, however, suffers from a flaw called starvation: low-priority jobs can wait indefinitely while higher-priority jobs keep arriving. To avoid this we have used priority-based scheduling with weighted queues.
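The two priority criteria can be combined in many ways, and the slides do not say how. The sketch below shows one hypothetical mapping onto three levels; the function name, the cost threshold and the precedence of cost over deadline are all assumptions chosen only to make the idea concrete.

```python
from typing import Optional

# Smaller number means higher priority (an assumed convention).
HIGH, NORMAL, LOW = 1, 2, 3


def compute_priority(
    cost_per_unit_time: float,
    deadline: Optional[float],
    high_cost_threshold: float = 10.0,  # hypothetical cut-off, not from the slides
) -> int:
    """Map a job to a priority level using the two criteria from the slides."""
    if cost_per_unit_time >= high_cost_threshold:
        return HIGH  # jobs paying more per unit time come first
    if deadline is not None:
        return NORMAL  # deadline-constrained jobs outrank unconstrained ones
    return LOW
```

For example, `compute_priority(5.0, deadline=None)` yields `LOW`, while the same job with any deadline set yields `NORMAL`.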

Page 19: Distributed Services Scheduling and Cloud Provisioning

Proposed Scheduling Algorithm

The proposed scheduling algorithm uses three weighted queues denoting three distinct levels of priority:

- High Priority Job Queue with priority 1
- Normal Priority Job Queue with priority 2
- Low Priority Job Queue with priority 3

The scheduler starts by invoking two threads, the Job_Enqueue Thread and the Job_Dequeue Thread. In our framework the scheduler is implemented at the master node.
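The way weighted queues avoid starvation can be sketched as follows. The 3/2/1 jobs-per-round split is the one annotated on the scheduler diagram in these slides; the class and method names are hypothetical, and the sketch is single-threaded for clarity rather than split into enqueue and dequeue threads.

```python
from collections import deque

# Jobs taken per round from the queues with priority 1, 2 and 3.
WEIGHTS = {1: 3, 2: 2, 3: 1}


class WeightedQueues:
    """Three FIFO queues served in weighted round-robin order."""

    def __init__(self):
        self.queues = {priority: deque() for priority in WEIGHTS}

    def enqueue(self, job, priority):
        self.queues[priority].append(job)

    def dequeue_round(self):
        """One pass over the queues in priority order; returns the jobs taken."""
        taken = []
        for priority in sorted(self.queues):
            queue = self.queues[priority]
            # Take up to the queue's weight, or fewer if it runs out of jobs.
            for _ in range(min(WEIGHTS[priority], len(queue))):
                taken.append(queue.popleft())
        return taken

    def empty(self):
        return not any(self.queues.values())
```

Because every round takes at least one job from the low-priority queue whenever it is non-empty, low-priority jobs keep making progress even under a steady stream of high-priority arrivals.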

Page 20: Distributed Services Scheduling and Cloud Provisioning

[Figure: scheduler of the proposed model. Labelled elements: High Priority Queue: 1, Normal Priority Queue: 2, Low Priority Queue: 3, Scheduler, Job_Enqueue Thread, Job_Dequeue Thread, Helper node Routine, LoadBalancer Routine, and jobs J1 to J4. Numbered annotations from the diagram:]

1. Client job requests arrive at the Scheduler.
2. The Scheduler starts this thread.
3. Computes the priority of job J1 (let it be Normal).
4. Queues the job to the corresponding queue.
5. Always running; stops if the three queues are empty.
6. Runs over the 3 queues in RRF (round-robin fashion) and dequeues 3, 2 and 1 jobs from the queues with priority 1, 2 and 3 respectively.
7. The Helper node routine fetches from the Server Database the ids of unallocated VMs that are eligible for the job.
8. The LoadBalancer checks whether the load data (free RAM, CPU usage) of the eligible VMs is present in the cache. If not, it queries the database for the load data.
9. Finally, the LoadBalancer allocates job J1 to the most under-utilised VM and marks it as allocated in the server database.
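Steps 8 and 9 (cache lookup with database fallback, then picking the most under-utilised VM) can be sketched as below. This is an illustration under assumptions: load is collapsed into a single number between 0 and 1, although the slides mention both free RAM and CPU usage, and `query_database` is a hypothetical callable standing in for the server database.

```python
def pick_vm(eligible_ids, load_cache, query_database):
    """Return the id of the least-loaded eligible VM, or None if there is none.

    load_cache maps vm_id -> load in [0, 1] (an assumed single-number metric).
    query_database(vm_id) is a hypothetical lookup used only on a cache miss.
    """
    best_id, best_load = None, None
    for vm_id in eligible_ids:
        if vm_id not in load_cache:
            # Cache miss: fall back to the database and remember the answer.
            load_cache[vm_id] = query_database(vm_id)
        load = load_cache[vm_id]
        if best_load is None or load < best_load:
            best_id, best_load = vm_id, load
    return best_id
```

The caller would then mark the returned VM as allocated in the server database, as step 9 describes.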

Page 21: Distributed Services Scheduling and Cloud Provisioning

Shortcomings of Previous Model

- With a large number of incoming requests, a single cloud master node cannot handle all of them simultaneously without degrading the overall quality of service.
- Since the number of virtual machines can be very large, it is not feasible to store such a large amount of data and query the database frequently.
- The number of tasks in the various weighted queues differs at different instants in time, so the dequeue rates need to adapt to the current demand.
- Delivery of service can be improved and the job drop ratio can be reduced.
- The response time of the cloud service can be improved further.

To overcome these shortcomings we proposed a revised framework.

Page 22: Distributed Services Scheduling and Cloud Provisioning

Revised Framework

- A distributed master node replaces the centralised master node.
- Nodes and clusters are organised hierarchically, which distributes the computational complexity.
- The dequeue rates of the three queues change dynamically, maintaining the priority order, in order to meet deadline constraints.
- A tracking system keeps track of the enqueued jobs and tries to minimise the number of jobs that go past their deadlines, which makes the service more reliable.
- The caching mechanism has been improved to further reduce the number of database queries.
- Nodes are made self-aware, i.e., they are given a choice whether to accept a job request or to forward it to a less loaded cluster.

Page 23: Distributed Services Scheduling and Cloud Provisioning

[Figure: revised framework overview. Labelled elements: Distributed Interface to Receive Client Requests; Scheduler with Job_Enqueue Thread, Job_Dequeue Thread, dequeue_rate Thread and update_priority Thread; High, Medium and Low Weighted Queues; Urgent path; Helper node Routine; Windows Cache and Linux Cache, each with an Update Cache Routine; Clusters A and B; Cluster Gateway (serves as Load Balancer); Sub-Clusters Within A Cluster; jobs J1 to J4.]

Page 24: Distributed Services Scheduling and Cloud Provisioning

[Figure: revised scheduler. Labelled elements: Distributed Interface to Receive Client Requests; Scheduler with Job_Enqueue Thread, Job_Dequeue Thread, dequeue_rate Thread and update_priority Thread; High, Medium and Low Weighted Queues; Urgent path; Helper node Routine; jobs J1 to J9 and Ji, Jk, Jl, Jm. Numbered annotations from the diagram:]

1. Client job requests arrive at the Distributed Interface.
2. The Scheduler receives requests from the Distributed Interface.
3.1. If the urgent flag is set, the job is immediately sent for resource allocation.
3.2. Otherwise, the Scheduler starts this thread.
4. Computes the priority of job J3 (let it be High) and queues it in the respective queue.
4.1. Periodically computes the job dequeue rate of the three queues based on their priority ratio and the number of jobs present in each queue.
4.2. Based on the dequeue rate, checks whether the last queued job in a queue can be dequeued within its deadline. If not, updates the priority of the job and moves it to a higher queue.
5. Always running; stops if the three queues are empty.
6. Runs over the 3 queues in RRF (round-robin fashion) and dequeues d1, d2 and d3 jobs from the queues with priority P1, P2 and P3 (P1 > P2 > P3) respectively.
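Steps 4.1 and 4.2 of the revised scheduler can be illustrated with a simple proportional rule. The slides do not give the actual formula, so everything below is an assumption: rates are split in proportion to weight times queue length, and a job is promoted when the rounds needed to reach it exceed the time left before its deadline. This plain proportional rule does not by itself guarantee d1 ≥ d2 ≥ d3, so an implementation that must preserve the priority order would need an additional constraint.

```python
import math


def dequeue_rates(queue_lengths, priority_weights, budget_per_round):
    """Split a per-round dequeue budget in proportion to weight * queue length.

    All three arguments are hypothetical inputs; the slides only say that the
    rate depends on the priority ratio and the number of jobs in each queue.
    """
    demand = [w * n for w, n in zip(priority_weights, queue_lengths)]
    total = sum(demand)
    if total == 0:
        return [0.0] * len(queue_lengths)
    return [budget_per_round * d / total for d in demand]


def should_promote(position, rate_per_round, round_seconds, seconds_to_deadline):
    """True if the job at 1-based `position` would miss its deadline at this rate."""
    if rate_per_round <= 0:
        return True  # the queue is not being served at all
    rounds_needed = math.ceil(position / rate_per_round)
    return rounds_needed * round_seconds > seconds_to_deadline
```

With equal queue lengths and weights 3, 2 and 1, a budget of 6 jobs per round reduces to the fixed 3/2/1 split of the earlier model.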

Page 25: Distributed Services Scheduling and Cloud Provisioning

[Figure: helper node routing. Labelled elements: Job_Dequeue Thread, Urgent path, Helper node Routine, Windows Cache and Linux Cache, each with an Update Cache Routine, and jobs Ji, Jk, Jl, Jm. Numbered annotations from the diagram:]

7. Jobs arrive at the Helper Node Routine from the queues or directly from the scheduler.
8. The Helper node routine checks the OS bit in the specification header and queries the corresponding cache for the ids of clusters that are eligible for the job.
8.1. If it does not find any entry corresponding to the specification, it sends a query to the corresponding gateway.
8.2. The cluster returns an id, which is also written to the cache.
9. Sends the job to the sub-cluster id obtained from the cache or gateway, with a Load bit set to 1 if the id came directly from the cache and 0 if it came from the gateway.
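The cache-then-gateway lookup in steps 8 to 9 can be sketched as follows. The function name, the shape of the cache (a dictionary per operating system keyed by a specification tuple) and the `query_gateway` callable are hypothetical; only the Load bit convention (1 from cache, 0 from gateway) is taken from the slides.

```python
def route_job(job_os, spec_key, caches, query_gateway):
    """Return (sub_cluster_id, load_bit) for a job.

    caches: e.g. {"windows": {...}, "linux": {...}}, each mapping a
        specification key to a sub-cluster id (assumed layout).
    query_gateway(job_os, spec_key): hypothetical call to the cluster gateway.
    load_bit: 1 if the id came from the cache, 0 if it came from the gateway.
    """
    cache = caches[job_os]  # step 8: pick the cache matching the OS bit
    if spec_key in cache:
        return cache[spec_key], 1
    sub_cluster_id = query_gateway(job_os, spec_key)  # step 8.1
    cache[spec_key] = sub_cluster_id  # step 8.2: remember the gateway's answer
    return sub_cluster_id, 0
```

The Load bit tells the receiving sub-cluster whether the routing decision was based on fresh gateway data or on a possibly stale cache entry.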

Page 26: Distributed Services Scheduling and Cloud Provisioning

10. If the Load bit is 0, the sub-cluster processes the request.
10.1. If the Load bit is 1 and the load on the sub-cluster exceeds threshold_Load, the sub-cluster queries the Gateway for a more appropriate cluster. If one exists, the cache is updated with the new sub-cluster id and the job is sent to the new sub-cluster with the Load bit set to 0, so that it does not query again.
10.1.1. The Update Cache Routine updates the cache with the new sub-cluster id.

Cluster_load_notifier Thread: runs periodically to report the current average load on the sub-cluster to its gateway.

VM_notifier Thread: runs periodically on the VMs to notify their core node of their current status and load.

The Gateway maintains a priority data structure that orders the sub-clusters by their current load, so that it can answer any request query in O(1) time. The Gateway also maintains the mapping from job id to cluster id, which is used to check the status of jobs.

Internally, the VMs register their specifications at the host machines, and the hosts register on the sub-cluster node (marked with a blue circle in the diagram). Finally, jobs are allocated to the VMs to serve client requests.
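The self-aware behaviour in step 10 and the Gateway's constant-time lookup can be sketched together. The slides do not name the data structure used, so the version below is a stand-in: it recomputes the least-loaded sub-cluster on every load report, which keeps reads O(1) at the cost of O(n) writes. All class, method and parameter names are hypothetical.

```python
class Gateway:
    """Tracks sub-cluster loads and answers 'least loaded' in O(1)."""

    def __init__(self):
        self._loads = {}
        self._least_loaded = None

    def report_load(self, sub_cluster_id, load):
        """Called by a sub-cluster's periodic load notifier."""
        self._loads[sub_cluster_id] = load
        # Recompute on write so that reads stay constant time.
        self._least_loaded = min(self._loads, key=self._loads.get)

    def load_of(self, sub_cluster_id):
        return self._loads[sub_cluster_id]

    def least_loaded(self):
        return self._least_loaded


def handle_job(sub_cluster_id, load_bit, current_load, threshold_load, gateway):
    """Return the sub-cluster that should process the job (steps 10 and 10.1)."""
    if load_bit == 0 or current_load <= threshold_load:
        return sub_cluster_id  # fresh routing or acceptable load: process here
    candidate = gateway.least_loaded()
    if (
        candidate is not None
        and candidate != sub_cluster_id
        and gateway.load_of(candidate) < current_load
    ):
        return candidate  # forwarded with Load bit 0 so it is not re-queried
    return sub_cluster_id
```

A heap or ordered structure would make writes cheaper if load reports are much more frequent than queries; which trade-off the original system made is not stated.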

Page 27: Distributed Services Scheduling and Cloud Provisioning

Simulation Plots

[Plot: Job Queue Rate used as an Input for Testing]

Page 28: Distributed Services Scheduling and Cloud Provisioning

Simulation Plots

[Plot: Jobs Dequeued v/s Time. Panels: Previous Model, Revised Model]

Page 29: Distributed Services Scheduling and Cloud Provisioning

Simulation Plots

[Plot: Jobs Missing Deadline v/s Time. Panels: Previous Model, Revised Model]

Page 30: Distributed Services Scheduling and Cloud Provisioning

Simulation Plots

[Plot: Load v/s Time for 5 VMs]

Page 31: Distributed Services Scheduling and Cloud Provisioning

The End