distributed services scheduling and cloud provisioning
DESCRIPTION
This is the presentation for my final-year project at NIT Allahabad (2013-14). The purpose of the project is to design a scheduling algorithm for a cloud environment with proper resource management.
TRANSCRIPT
Distributed Services Scheduling and Cloud Provisioning
Project Mentor - Mr. Shashank Srivastava
Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad, Allahabad
Project Members
Name Registration Number
Prashant Mishra 20105094
Nishant Narang 20104085
Sonu Goel 20103030
Trishala Saini 20104119
Arjit Agarwal 20094044
Introduction to Cloud Computing
In computer science, cloud computing is often used as a synonym for distributed computing over a network: the ability to run a program or application on many connected computers at the same time. More specifically, it refers to network-based services that appear to be provided by real server hardware but are in fact served by virtual hardware, simulated by software running on one or more real machines.
[Figure: A Simplified Cloud Computing Architecture. Labelled components: Clients, Control Node, Database (Storage), Computer Network, Application Servers, Cloud.]
Cloud Service Models (SaaS, PaaS, IaaS): Infrastructure as a Service
IaaS is the delivery of technology infrastructure as an on-demand, scalable service. It offers resources such as a virtual machine disk image library, block or file-based storage, firewalls, load balancers, virtual LANs and software bundles. Cloud providers typically charge customers for IaaS services based on the amount of resources allocated and consumed.
Cloud Service Models: Platform as a Service
PaaS providers make available to the client a computational platform, typically including an OS, a programming language and an execution environment. Clients pay only a nominal fee for using the cloud services, which saves the cost of purchasing the underlying hardware.
Cloud Service Models: Software as a Service
SaaS lets clients use software installed on the cloud via an access client or a browser (web service). Remote Desktop Virtualization is a common example of SaaS, e.g. VNCViewer. It is usually priced on a pay-per-use basis.
Motivation for the Project
- To allow users to access cloud services from “anywhere, anytime”, deployment of cloud applications has to be easy.
- The average utilisation of CPU and RAM on a normal user's system is below 10%, so end users can share their resources with the cloud, benefiting themselves and others in return.
- Applications should be scheduled with efficient use of resources and with regard to user preferences.
- Parallel handling of requests leads to faster scheduling of incoming requests and better utilisation of available resources.
- Parallel computing and virtualization are very cost-effective and may lead to optimum utilisation of available resources.
Proposed Framework
The proposed framework architecture has the following components:
- User (User Machines): The customer who uses cloud services for deploying and executing his apps.
- Controlling Master Node: It receives requests from the clients, and it is where Virtual Machines (VMs) register to share their resources. The scheduling algorithm runs on this node.
- Server Database: It stores the requests received from end users, their details, and details of the application or commands to be executed. It also maintains the list of VMs that are registered and online to serve a client's request.
- Scheduler: It is integrated into the master node and runs the “dynamic priority-based weighted queue scheduling algorithm”.
- Working VM Nodes: These are connected to the master node and execute client requests as directed by the scheduler. These VMs provide their remote desktop to clients.
[Figure: Architecture of Framework Proposed by Us. Labels: client requests; sharing of resources between VMs; TCP connection initiated by VM with client for Remote Desktop Virtualization.]
Implementation of the Framework
The proposed framework is implemented by three applications, which are object-oriented and completely modular. They are as follows:
Client Framework
This application is meant to be installed on client machines. It enables users to connect to the cloud using the known IP address and port on which the cloud applications are running. It allows users to request a remote desktop of a VM running on the cloud, constrained by a few requirements:
- Minimum RAM required
- Minimum hard disk space required
- Operating system (Windows XP/Vista/7/8, Mac OS X, Ubuntu, other Linux distributions, etc.)
- Duration
- Priority (higher priority has higher charges per unit time)
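The constraints a client submits with a remote desktop request could be modelled as a small record. This is only a hypothetical sketch: the class name `VMRequest`, its field names, the units, and the `satisfied_by` helper are assumptions for illustration and are not taken from the project code. The priority numbering (1 = high) follows the queue numbering used later in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMRequest:
    """Hypothetical record of the constraints a client submits."""
    min_ram_mb: int      # minimum RAM required (unit assumed: MB)
    min_disk_gb: int     # minimum hard disk space required (unit assumed: GB)
    os_name: str         # requested operating system, e.g. "Ubuntu"
    duration_min: int    # requested duration (unit assumed: minutes)
    priority: int        # 1 = high, 2 = normal, 3 = low

    def satisfied_by(self, vm_ram_mb: int, vm_disk_gb: int, vm_os: str) -> bool:
        """Return True if a VM with the given resources meets this request."""
        return (vm_ram_mb >= self.min_ram_mb
                and vm_disk_gb >= self.min_disk_gb
                and vm_os == self.os_name)


request = VMRequest(min_ram_mb=2048, min_disk_gb=20, os_name="Ubuntu",
                    duration_min=60, priority=2)
print(request.satisfied_by(vm_ram_mb=4096, vm_disk_gb=50, vm_os="Ubuntu"))  # True
```

A record like this is the kind of information the master node would need in order to filter the registered VMs for eligibility.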
The Client Framework has two main packages:
- Client FrontEnd: It provides the GUI for the client to input the various specifications and connect to the server.
- Client Remote Interface: It handles the remote desktop that is tunneled to it by the VM. It also records all Mouse Click and Key Typed events and sends them to the remote VM.
[Screenshot: Client FrontEnd]
Server Framework
This application needs to run on the Controlling Master Node. The main tasks of this framework include:
- Listens for client and VM requests.
- Invokes a client handler thread and a VM handler thread.
- On receiving a client request, invokes a scheduler client enqueue thread.
- On receiving a VM register request, makes a new entry in the database by invoking the Database Handler object.
- Starts a NotificationReceiver thread to periodically update the current load (in terms of CPU and RAM usage) on the registered VMs.
- The Scheduler Dispatcher thread dequeues the appropriate request, selects the corresponding VM and dispatches the job to the VM.
The Server Framework has four main packages:
- Server FrontEnd: It provides the GUI to start the master node and to view registered VMs and log details.
- Server Request Handler: It handles the client requests and starts a client request thread, which extracts the requirements specified by the user.
- Server DB Handler: It handles the VM and client request database.
- Server Scheduler: It selects an appropriate VM and dispatches the client request to it.
[Screenshot: Server FrontEnd]
VM Framework
This application is to be installed on the VMs running on the worker nodes, spawned by hypervisors such as VMWare or Oracle VirtualBox. With the help of this framework, VMs can share their resources on the cloud. The main tasks of this framework include:
- Fetches system information and sends it to the controlling node at periodic intervals via its Notifier thread.
- Listens for the dispatcher's instructions for servicing a client request.
- Invokes the Remote Desktop Sender thread when a request is dispatched to it.
- Sends an acknowledgement to the client node.
- Sends the machine snapshots to the client.
- Receives the Mouse Click and Key Typed events from the client end and performs the related operations on the VM whose remote desktop is assigned to it.
The VM Framework has two main packages:
- VM FrontEnd: It provides the GUI to monitor the VM, to input the various specifications, and to connect to the server to periodically update load-related data in the server database. On receiving a client request, it tries to send an acknowledgement to the client.
- VM Remote Desktop Sender Interface: It handles the remote desktop that is tunneled to the client. It also receives all Mouse Click and Key Typed events from the client and performs the corresponding action.
[Screenshot: VM FrontEnd]
[Figure: Sequence Flow Diagram between Server Node, Client Node and VM Node. Step labels recovered from the diagram, in order of appearance:]
- Start Client Thread and VM Thread
- Register Information
- Listen on Port 6060
- Start Notifier Thread
- Request VM
- Enqueue Request
- Start Scheduler Thread
- Find appropriate VM
- Dequeue Client Request
- Send Random Port (R)
- Open Random Port and wait for connection from VM
- Client IP and random port R
- Start thread to receive status
- Acknowledge request on port R
- Start Remote Receiver Interface
- Send Ready Message
- Start Remote Sender Interface
- Send Snapshots Periodically
- Fetch Mouse Click/Key Press Events
- Send Command (Click Events)
- Execute Received Command and Send Snapshot
- Terminate Connection
- Request Serviced
Proposed Scheduling Algorithm
The framework was initially embedded with the traditional FCFS job scheduling algorithm. But since cloud infrastructure is “on demand, pay per use”, we cannot rate all jobs alike, so FCFS failed to serve the purpose. This led us to a priority-based job scheduling algorithm, where the priority can be decided on the following basis:
- Jobs with a higher cost per unit time are assigned a higher priority than jobs with a lower cost per unit time.
- Deadline-constrained jobs are given a higher priority than jobs whose time limits are not constrained.
A simple priority-based job scheduling algorithm, however, suffers from a flaw called starvation: low-priority jobs may wait indefinitely while higher-priority jobs keep arriving. To avoid this, we have used priority-based scheduling with weighted queues.
The proposed scheduling algorithm uses three weighted queues denoting three distinct levels of priority:
- High Priority Job Queue with priority 1
- Normal Priority Job Queue with priority 2
- Low Priority Job Queue with priority 3
The scheduler starts by invoking two threads, the Job_Enqueue Thread and the Job_Dequeue Thread. In our framework the scheduler is implemented at the master node.
[Figure: Scheduler with the Job_Enqueue Thread and Job_Dequeue Thread, the High (1), Normal (2) and Low (3) Priority Queues, the Helper node Routine and the LoadBalancer Routine, processing jobs J1 to J4.]
1. Client job requests arrive at the Scheduler.
2. The Scheduler starts the Job_Enqueue Thread.
3. It computes the priority of job J1 (let it be Normal).
4. It queues the job in the corresponding queue.
5. The Job_Dequeue Thread is always running and stops only if the three queues are empty.
6. It runs over the three queues in round-robin fashion (RRF) and dequeues 3, 2 and 1 jobs from the queues with priority 1, 2 and 3 respectively.
7. The Helper node Routine fetches from the Server Database the ids of unallocated VMs that are eligible for the job.
8. The LoadBalancer checks whether the eligible VMs' load data (free RAM, CPU usage) is present in the cache. If not, it queries the database for the load data.
9. Finally, the LoadBalancer allocates job J1 to the most under-utilised VM and updates it as allocated in the server database.
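The dequeue and allocation steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names `dequeue_round` and `pick_vm` are hypothetical, and representing a VM's load as a single number between 0 and 1 is an assumption (the slides mention free RAM and CPU usage without saying how they are combined). Only the 3, 2, 1 weights come from the slides.

```python
from collections import deque

# Jobs dequeued per pass from the queues with priority 1, 2 and 3 (from the slides).
WEIGHTS = {1: 3, 2: 2, 3: 1}


def dequeue_round(queues):
    """Run one pass over the three queues and return the jobs taken, in order."""
    taken = []
    for priority in (1, 2, 3):
        for _ in range(WEIGHTS[priority]):
            if not queues[priority]:
                break
            taken.append(queues[priority].popleft())
    return taken


def pick_vm(eligible_vms, load):
    """Pick the most under-utilised VM; `load` maps VM id -> assumed load in [0, 1]."""
    return min(eligible_vms, key=lambda vm_id: load[vm_id])


queues = {
    1: deque(["H1", "H2", "H3", "H4"]),
    2: deque(["N1", "N2", "N3"]),
    3: deque(["L1", "L2"]),
}
first_round = dequeue_round(queues)
print(first_round)  # ['H1', 'H2', 'H3', 'N1', 'N2', 'L1']
print(pick_vm(["vm-a", "vm-b"], {"vm-a": 0.7, "vm-b": 0.2}))  # vm-b
```

Because the low-priority queue is served at least once in every pass, a job such as `L1` is dequeued even while high-priority jobs are still waiting, which is how the weighted queues avoid starvation.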
Shortcomings of Previous Model
- Since there is a large number of incoming requests, a single cloud master node cannot handle all the requests simultaneously without degrading the net quality of service.
- Since the number of virtual machines can be very large, it is not feasible to store such a huge amount of data and query the database frequently.
- The number of tasks in the various weighted queues differs at different instants in time. Hence we need to alter the dequeue rates to adapt to the current demand.
- Delivery of service can be improved and the job drop ratio can be reduced.
- Response time of the cloud service can be improved further.
To overcome these shortcomings we proposed a revised Framework.
Revised Framework
- A distributed master node in place of the centralised master node.
- A hierarchical structure of nodes and clusters, which distributes the computational complexity.
- Dequeue rates of the three queues change dynamically, maintaining the priority order, to meet deadline constraints.
- We incorporate a system that keeps track of the enqueued jobs and tries to minimise the number of jobs that go past their deadlines, thereby improving reliability of service.
- The caching mechanism has been improved to further reduce the number of database queries.
- Nodes are made self-aware, i.e., they are given a choice whether to accept a job request or to forward it to a less loaded cluster.
[Figure: Revised framework architecture. Labelled components: Distributed Interface to Receive Client Requests; Scheduler with Job_Enqueue Thread, Job_Dequeue Thread, dequeue_rate Thread and update_priority Thread; High, Medium and Low Weighted Queues; Urgent path; Helper node Routine; Windows Cache and Linux Cache, each with an Update Cache Routine; Clusters A and B; Cluster Gateway (serves as Load Balancer); Sub-Clusters within a Cluster. Jobs J1 to J9 are shown moving through the queues.]
1. Client job requests arrive at the Distributed Interface.
2. The Scheduler receives requests from the Distributed Interface.
3.1. If the urgent flag is set, the job is immediately sent for resource allocation.
3.2. Otherwise, the Scheduler starts the Job_Enqueue Thread.
4. It computes the priority of job J3 (let it be High) and queues it in the respective queue.
4.1. The dequeue_rate Thread periodically computes the job dequeue rate of the three queues based on their priority ratio and the number of jobs present in each queue.
4.2. Based on the dequeue rate, the update_priority Thread checks whether the last queued job in a queue can be dequeued within its deadline. If not, it updates the priority of the job and moves it to a higher queue.
5. The Job_Dequeue Thread is always running and stops only if the three queues are empty.
6. It runs over the three queues in round-robin fashion (RRF) and dequeues d1, d2 and d3 jobs from the queues with priority P1, P2 and P3 (P1 > P2 > P3) respectively.
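Steps 4.1 and 4.2 above can be illustrated with a small sketch. The slides do not give the actual formula, so everything numeric here is an assumption: the rates are split in proportion to (priority weight × queue length), the 3:2:1 weights are borrowed from the earlier model, the per-pass `budget` is hypothetical, and deadlines are measured in scheduler passes. The sketch also makes no attempt to enforce an ordering between the three rates. All function names are hypothetical.

```python
from collections import deque

BASE_WEIGHTS = {1: 3, 2: 2, 3: 1}  # assumed priority ratio (3:2:1 from the earlier model)


def dequeue_rates(queue_lengths, budget=6):
    """Split `budget` dequeues per pass across the queues, proportional to
    weight * backlog. This formula is an assumption, not from the slides."""
    demand = {p: BASE_WEIGHTS[p] * queue_lengths[p] for p in BASE_WEIGHTS}
    total = sum(demand.values())
    if total == 0:
        return {p: 0 for p in BASE_WEIGHTS}
    # Every non-empty queue keeps at least one slot so that it cannot starve.
    return {p: max(1, round(budget * demand[p] / total)) if queue_lengths[p] else 0
            for p in BASE_WEIGHTS}


def passes_until_served(position, rate):
    """Passes needed before the job at 0-based `position` leaves its queue."""
    return position // rate + 1


def promote_if_late(queues, rates, priority, passes_to_deadline):
    """Move the last job of queue `priority` one level up if it would miss its deadline."""
    queue = queues[priority]
    if priority == 1 or not queue or rates[priority] == 0:
        return False
    if passes_until_served(len(queue) - 1, rates[priority]) <= passes_to_deadline:
        return False
    queues[priority - 1].append(queue.pop())
    return True


queues = {
    1: deque(["H1", "H2", "H3", "H4"]),
    2: deque(["N1", "N2"]),
    3: deque(["L1", "L2"]),
}
rates = dequeue_rates({p: len(q) for p, q in queues.items()})
promoted = promote_if_late(queues, rates, priority=3, passes_to_deadline=1)
print(rates, promoted, list(queues[2]))
```

In this example the low queue is drained one job per pass, so its last job `L2` would need two passes; with a deadline of one pass it is moved up to the medium queue.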
[Figure: jobs Jm, Jl, Ji and Jk flowing from the Job_Dequeue Thread (or directly, when urgent) to the Helper node Routine, which consults the Windows Cache and the Linux Cache; each cache has an Update Cache Routine.]
7. Jobs arrive at the Helper node Routine from the queues or directly from the scheduler.
8. The Helper node Routine checks the OS bit in the specification header and queries the corresponding cache for the ids of clusters that are eligible for the job.
8.1. If it does not get any entry corresponding to the job's specification, it sends a query to the corresponding gateway.
8.2. The cluster returns an id and the same is updated in the cache.
9. It sends the job to the sub-cluster id obtained from the cache or gateway, with a Load bit set to 1 if the id came directly from the cache and 0 if it came from the gateway.
10. If the Load bit is 0, the sub-cluster processes the request.
10.1. If the Load bit is 1 and the load on the sub-cluster > threshold_Load, the sub-cluster sends a query to the Gateway to find a more appropriate cluster. If one exists, the cache is updated with the new sub-cluster id and the job is sent to the new sub-cluster with the Load bit set to 0 so that it does not query again.
10.1.1. The Update Cache Routine updates the cache with the new sub-cluster id.
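The cache lookup and Load bit handling in steps 8 to 10.1.1 can be sketched as below. This is a simplified illustration under stated assumptions: the cache holds one sub-cluster id per operating system (the slides key it on the job specification), the `Gateway` class is a stand-in that simply knows every sub-cluster's load, and the value of `THRESHOLD_LOAD` is invented since the slides only name `threshold_Load`. The function names are hypothetical.

```python
THRESHOLD_LOAD = 0.8  # assumed value; the slides only name threshold_Load


class Gateway:
    """Stand-in for a cluster gateway that knows each sub-cluster's load."""

    def __init__(self, loads):
        self.loads = loads  # sub-cluster id -> load in [0, 1]

    def least_loaded(self):
        return min(self.loads, key=self.loads.get)


def route(job_os, cache, gateway):
    """Helper-node step: return (sub_cluster_id, load_bit) for a job."""
    if job_os in cache:
        return cache[job_os], 1          # from the cache: load info may be stale
    sub_cluster = gateway.least_loaded()
    cache[job_os] = sub_cluster          # the gateway's answer refreshes the cache
    return sub_cluster, 0


def accept_or_forward(job_os, sub_cluster, load_bit, cache, gateway):
    """Sub-cluster step: return the id of the sub-cluster that runs the job."""
    if load_bit == 0 or gateway.loads[sub_cluster] <= THRESHOLD_LOAD:
        return sub_cluster
    better = gateway.least_loaded()
    if better != sub_cluster:
        cache[job_os] = better           # step 10.1.1: update the cache
        return better                    # forwarded with Load bit 0
    return sub_cluster


gateway = Gateway({"A1": 0.9, "A2": 0.3})
cache = {"linux": "A1"}
target, load_bit = route("linux", cache, gateway)
final = accept_or_forward("linux", target, load_bit, cache, gateway)
print(target, load_bit, final, cache["linux"])  # A1 1 A2 A2
```

The Load bit keeps a forwarded job from bouncing between sub-clusters: once a placement has been confirmed by the gateway, the receiving sub-cluster processes it without querying again.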
Cluster_load_notifier Thread: Runs periodically to update the current average load on a sub-cluster to its gateway.
The Gateway maintains a priority data structure that orders the sub-clusters according to their current load, so that it can answer any request query in O(1) time. The Gateway also maintains the job-id to cluster-id mapping to check the status of jobs.
Internally, the VMs register their specifications at the host machines, and the hosts register on the sub-cluster node (marked with a blue circle in the figure). Finally, it allocates jobs to the VMs to serve client requests.
VM_notifier Thread: Runs periodically on VMs to notify its core node of the current status and load.
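One simple way to obtain the O(1) query described for the gateway is to recompute the least-loaded sub-cluster whenever a load report arrives, so that queries only read a stored value. This is a sketch of that idea, not the structure used in the project (which the slides do not specify); the class `GatewayIndex` and its method names are hypothetical.

```python
class GatewayIndex:
    """Tracks sub-cluster loads so that the least-loaded one is read in O(1)."""

    def __init__(self):
        self._loads = {}            # sub-cluster id -> latest reported average load
        self._best = None           # id of the least-loaded sub-cluster
        self._job_to_cluster = {}   # job id -> sub-cluster id, for status checks

    def report_load(self, sub_cluster, load):
        """Called on each periodic load report from a sub-cluster."""
        self._loads[sub_cluster] = load
        # O(n) work on update so that queries stay O(1).
        self._best = min(self._loads, key=self._loads.get)

    def least_loaded(self):
        return self._best

    def assign(self, job_id):
        """Record which sub-cluster a job was sent to and return that id."""
        self._job_to_cluster[job_id] = self._best
        return self._best

    def where_is(self, job_id):
        return self._job_to_cluster.get(job_id)


index = GatewayIndex()
index.report_load("S1", 0.6)
index.report_load("S2", 0.2)
first = index.assign("J1")
index.report_load("S2", 0.9)
second = index.assign("J2")
print(first, second, index.where_is("J1"))  # S2 S1 S2
```

The trade-off in this sketch is that each load report costs time proportional to the number of sub-clusters; a heap or balanced tree would make updates cheaper at the cost of more bookkeeping.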
Simulation Plots
[Plot: Job Queue Rate used as an Input for Testing]
[Plot: Jobs Dequeued v/s Time, Previous Model and Revised Model]
[Plot: Jobs Missing Deadline v/s Time, Previous Model and Revised Model]
[Plot: Load v/s Time for 5 VMs]
The End